Abstract

Open distributed multi-agent systems are gaining interest in the academic community
and in industry. In such open settings, agents are often coordinated using standardized
agent conversation protocols. The representation of such protocols (for analysis, validation,
monitoring, etc.) is an important aspect of multi-agent applications. Recently, Petri
nets have been shown to be an interesting approach to such representation, and radically
different approaches using Petri nets have been proposed. However, their relative strengths
and weaknesses have not been examined. Moreover, their scalability and suitability for
different tasks have not been addressed. This paper addresses both these challenges. First,
we analyze existing Petri net representations in terms of their scalability and appropriateness
for overhearing, an important task in monitoring open multi-agent systems. Then,
building on the insights gained, we introduce a novel representation using Colored Petri
nets that explicitly represent legal joint conversation states and messages. This representation
approach offers significant improvements in scalability and is particularly suitable
for overhearing. Furthermore, we show that this new representation offers a comprehensive
coverage of all conversation features of FIPA conversation standards. We also present
a procedure for transforming AUML conversation protocol diagrams (a standard human-readable
representation) to our Colored Petri net representation.



1. Introduction 



Open distributed multi-agent systems (MAS) are composed of multiple, independently-built 
agents that carry out mutually-dependent tasks. In order to allow inter-operability of agents 
of different designs and implementation, the agents often coordinate using standardized in- 
teraction protocols, or conversations. Indeed, the multi-agent community has been investing 
a significant effort in developing standardized Agent Communication Languages (ACL) to fa- 
cilitate sophisticated multi-agent systems (Finin, Labrou, & Mayfield, 1997; Kone, Shimazu, 
& Nakajima, 2000; ChaibDraa, 2002; FIPA site, 2003). Such standards define communica- 
tive acts, and on top of them, interaction protocols, ranging from simple queries as to the 
state of another agent, to complex negotiations by auctions or bidding on contracts. For 
instance, the FIPA Contract Net Interaction Protocol (FIPA Specifications, 2003b) defines 
a concrete set of message sequences that allows the interacting agents to use the contract 
net protocol for negotiations. 

Various formalisms have been proposed to describe such standards (e.g., Smith & Cohen,
1996; Parunak, 1996; Odell, Parunak, & Bauer, 2000, 2001b; AUML site, 2003). In particular,
AUML (Agent Unified Modelling Language) is currently used in the FIPA-ACL standards
(FIPA Specifications, 2003a, 2003b, 2003c, 2003d; Odell, Parunak, & Bauer, 2001a)¹. UML
2.0 (AUML site, 2003), a new emerging standard influenced by AUML, has the potential to 
become the FIPA-ACL standard (and a forthcoming IEEE standard) in the future. How- 
ever, for the moment, a large set of FIPA specifications remains formalized using AUML. 
While AUML is intended for human readability and visualization, interaction protocols 
should ideally be represented in a way that is amenable to automated analysis, validation 
and verification, online monitoring, etc. 

Lately, there is increasing interest in using Petri nets (Petri Nets site, 2003) in modelling 
multi-agent interaction protocols (Cost, 1999; Cost, Chen, Finin, Labrou, & Peng, 1999,
2000; Lin, Norrie, Shen, & Kremer, 2000; Nowostawski, Purvis, & Cranefield, 2001; Purvis,
Hwang, Purvis, Cranefield, & Schievink, 2002; Cranefield, Purvis, Nowostawski, & Hwang, 
2002; Ramos, Frausto, & Camargo, 2002; Mazouzi, Fallah-Seghrouchni, & Haddad, 2002; 
Poutakidis, Padgham, & Winikoff, 2002). There is broad literature on using Petri nets to 
analyze the various aspects of distributed systems (e.g. in deadlock detection as shown by 
Khomenco & Koutny, 2000), and there has been recent work on specific uses of Petri nets in 
multi-agent systems, e.g., in validation and testing (Desel, Oberweis, & Zimmer, 1997), in 
automated debugging and monitoring (Poutakidis et al., 2002), in dynamic interpretation of 
interaction protocols (Cranefield et al., 2002; de Silva, Winikoff, & Liu, 2003), in modelling 
agents' behavior induced by their participation in a conversation (Ling & Loke, 2003) and
in interaction protocol refinement allowing modular construction of complex conversations
(Hameurlain, 2003). 

However, key questions remain open on the use of Petri nets for conversation represen- 
tation. First, while radically different approaches to representation using Petri nets have 
been proposed, their relative strengths and weaknesses have not been investigated. Second, 
many investigations have only addressed restricted subsets of the features needed in repre- 
senting complex conversations such as those standardized by FIPA (see detailed discussion 
of previous work in Section 2). Finally, no procedures have been proposed for translating 
human-readable AUML protocol descriptions into the corresponding machine-readable Petri 
nets. 

This paper addresses these open challenges in the context of scalable overhearing. Here, 
an overhearing agent passively tracks many concurrent conversations involving multiple par- 
ticipants, based solely on their exchanged messages, while not being a participant in any of 
the overheard conversations itself (Novick & Ward, 1993; Busetta, Serafini, Singh, & Zini, 
2001; Kaminka, Pynadath, & Tambe, 2002; Poutakidis et al., 2002; Busetta, Dona, & Nori, 
2002; Legras, 2002; Gutnik & Kaminka, 2004a; Rossi & Busetta, 2004). Overhearing is use- 
ful in visualization and progress monitoring (Kaminka et al., 2002), in detecting failures in 
interactions (Poutakidis et al., 2002), in maintaining organizational and situational aware- 
ness (Novick & Ward, 1993; Legras, 2002; Rossi & Busetta, 2004) and in non-obtrusively 
identifying opportunities for offering assistance (Busetta et al., 2001, 2002). For instance, an 
overhearing agent may monitor the conversation of a contractor agent engaged in multiple 
contract-net protocols with different bidders and bid callers, in order to detect failures. 

We begin with an analysis of Petri net representations, with respect to scalability and
overhearing. We classify representation choices along two dimensions affecting scalability:
(i) the technique used to represent multiple concurrent conversations; and (ii) the choice
of representing either individual or joint interaction states. We show that while the run-time
complexity of monitoring conversations using different approaches is the same, choices
along these two dimensions have significantly different space requirements, and thus some
choices are more scalable (in the number of conversations) than others. We also argue that
representations suitable for overhearing require the use of explicit message places, though
only a subset of previously-explored techniques utilized those.

1. (FIPA Specifications, 2003c) is currently deprecated. However, we use this specification since it describes
many important features needed in modelling multi-agent interactions.

Building on the insights gained, the paper presents a novel representation that uses 
Colored Petri nets (CP-nets) in which places explicitly denote messages, and valid joint 
conversation states. This representation is particularly suited for overhearing as the number 
of conversations is scaled-up. We show how this representation can be used to represent 
essentially all features of FIPA AUML conversation standards, including simple and com- 
plex interaction building blocks, communicative act attributes such as message guards and 
cardinalities, nesting, and temporal aspects such as deadlines and duration. 

To realize the advantages of machine-readable representations, such as for debugging 
(Poutakidis et al., 2002), existing human-readable protocol descriptions must be converted 
to their corresponding Petri net representations. As a final contribution in this paper, we 
provide a skeleton semi-automated procedure for converting FIPA conversation protocols 
in AUML to Petri nets, and demonstrate its use on a complex FIPA protocol. While this 
procedure is not fully automated, it takes a first step towards addressing this open challenge. 

This paper is organized as follows. Section 2 presents the motivation for our work. 
Sections 3 through 6 then present the proposed representation addressing all FIPA conver- 
sation features including basic interaction building blocks (Section 3), message attributes 
(Section 4), nested & interleaved interactions (Section 5), and temporal aspects (Section 6). 
Section 7 ties these features together: It presents a skeleton algorithm for transforming an 
AUML protocol diagram to its Petri net representation, and demonstrates its use on a challenging
FIPA conversation protocol. Section 8 concludes. The paper closes with three
appendices. The first provides a quick review of Petri nets. Then, to complete coverage of
FIPA interactions, Appendix B provides additional interaction building blocks. Appendix C 
presents a Petri net of a complex conversation protocol, which integrates many of the features 
of the developed representation technique. 

2. Representations for Scalable Overhearing 

Overhearing involves monitoring conversations as they progress, by tracking messages that 
are exchanged between participants (Gutnik & Kaminka, 2004a). We are interested in repre- 
sentations that can facilitate scalable overhearing, tracking many concurrent conversations, 
between many agents. We focus on open settings, where the complex internal state and con- 
trol logic of agents is not known in advance, and therefore exclude discussions of Petri net 
representations which explicitly model agent internals (e.g., Moldt & Wienberg, 1997; Xu
& Shatz, 2001). Instead, we treat agents as black boxes, and consider representations that
commit only to the agent's conversation state (i.e., its role and progress in the conversation). 

The suitability of a representation for scalable overhearing is affected by several facets. 
First, since overhearing is based on tracking messages, the representation must be able to 
explicitly represent the passing of a message (communicative act) from one agent to another
(Section 2.1). Second, the representation must facilitate tracking of multiple concurrent
conversations. While the tracking runtime is bounded from below by the number of messages 
(since in any case, all messages are overheard and processed), space requirements may differ 
significantly (see Sections 2.2-2.3). 

2.1 Message-monitoring versus state-monitoring 

We distinguish two settings for tracking the progress of conversations, depending on the 
information available to the tracking agent. In the first type of setting, which we refer to 
as state monitoring, the tracking agent has access to the internal state of the conversation 
in one or more of the participants, but not necessarily to the messages being exchanged. 
The other setting involves message monitoring, where the tracking agent has access only to
the messages being exchanged (which are externally observable), but cannot directly observe 
the internal state of the conversation in each participant. Overhearing is a form of message 
monitoring. 

Representations that support state monitoring use places to denote the conversation 
states of the participants. Tokens placed in these places (the net marking) denote the 
current state. The sending or receiving of a message by a participant is not explicitly 
represented, and is instead implied by moving tokens (through transition firings) to the new 
state places. Thus, such a representation essentially assumes that the internal conversation 
state of participants is directly observable by the monitoring agent. Previous work utilizing 
state monitoring includes work by Cost (1999), Cost et al. (1999, 2000), Lin et al. (2000), 
Mazouzi et al. (2002), Ramos et al. (2002). 

The representation we present in this paper is intended for overhearing tasks, and cannot 
assume that the conversation states of overheard agents are observable. Instead, it must 
support message monitoring, where in addition to using tokens in state places (to denote 
current conversation state), the representation uses message places, where tokens are placed 
when a corresponding message is overheard. A conversation-state place and a message 
place are connected via a transition to a state place denoting the new conversation state. 
Tokens placed in these originating places (indicating that a message was received at an appropriate
conversation state) will cause the transition to fire, and the tokens to be placed in the
new conversation state place. Thus the new conversation state is inferred from "observing"
a message. Previous investigations that have used explicit message places include work
by Cost (1999), Cost et al. (1999, 2000), Nowostawski et al. (2001), Purvis et al. (2002),
Cranefield et al. (2002), and Poutakidis et al. (2002)². These are discussed in depth below.
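
The message-monitoring mechanism described above can be illustrated with a minimal sketch. It is not taken from any of the cited systems: the class name Transition, the place names "waiting", "msg" and "done", and the use of a Counter as the net marking are all assumptions made for illustration. The sketch shows only that a transition with a state input place and a message input place cannot fire until the message has been overheard.

```python
from collections import Counter


class Transition:
    """Hypothetical transition with a state input place, a message input place,
    and a single output state place."""

    def __init__(self, state_in, msg_in, state_out):
        self.state_in = state_in
        self.msg_in = msg_in
        self.state_out = state_out

    def enabled(self, marking):
        # Both originating places must hold a token.
        return marking[self.state_in] > 0 and marking[self.msg_in] > 0

    def fire(self, marking):
        if not self.enabled(marking):
            return False
        marking[self.state_in] -= 1
        marking[self.msg_in] -= 1
        marking[self.state_out] += 1
        return True


marking = Counter({"waiting": 1})      # current conversation state is known
t = Transition("waiting", "msg", "done")
fired_before = t.fire(marking)         # no message overheard yet
marking["msg"] += 1                    # the overhearing agent observes msg
fired_after = t.fire(marking)          # new state is inferred from the message
```

Under these assumptions the new conversation state is never observed directly; it is derived only from the token deposited in the message place.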

2.2 Representing a Single Conversation 

Two representation variants are popular within those that utilize conversation places (in
addition to message places): Individual state representations use separate places and tokens
for the state of each participant (each role). Thus, the overall state of the conversation is
represented by different tokens marking multiple places. Joint state representations use a
single place for each joint conversation state of all participants. The placement of a token
within such a place represents the overhearing agent's belief that the participants are in the
appropriate joint state.

2. Cost (1999), Cost et al. (1999, 2000) present examples of both state- and message-monitoring
representations.

Most previous representations use individual states. In these, different markings distin- 
guish a conversation state where one agent has sent a message, from a state where the other 
agent received it. The net for each conversation role is essentially built separately, and is 
merged with the other nets, or connected to them via fusion places or similar means. 

Cost (1999), Cost et al. (1999, 2000) have used CP-nets with individual state places for 
representing KQML and FIPA interaction protocols. Transitions represent message events, 
and CP-net features, such as token colors and arc expressions, are used to represent AUML 
message attributes and sequence expressions. The authors also point out that deadlines (a 
temporal aspect of interaction) can be modelled, but no implementation details are provided. 
Cost (1999) also proposed using hierarchical CP-nets to represent hierarchical multi-agent 
conversations. 

Purvis et al. (2002), Cranefield et al. (2002) represented conversation roles as separate 
CP-nets, where places denote both interaction messages and states, while transitions repre- 
sent operations performed on the corresponding communicative acts such as send, receive, 
and process. Special in/out places are used to pass net tokens between the different CP-nets, 
through special get/put transitions, simulating the actual transmission of the corresponding
communicative acts. 

In principle, individual-state representations require two places in each role, for every 
message. For a given message, there would be two individual places for the sender (before 
sending and after sending), and similarly two more for each receiver (before receiving and 
after receiving). All possible conversation states, valid or not, can be represented. For a
single message and two roles, there are two places for each role (four places total), and four 
possible conversation states: message sent and received, sent and not received, not sent but 
incorrectly believed to have been received, not sent and not received. These states can be 
represented by different markings. For instance, a conversation state where the message has 
been sent but not received is denoted by a token in the 'after-sending' place of the sender
and another token in the 'before-receiving' place of the receiver. This is summarized in the 
following proposition: 

Proposition 1 Given a conversation with R roles and a total of M possible messages, an 
individual state representation has space complexity of O(MR). 

While the representations above all represent each role's conversation state separately, 
many applications of overhearing only require representation of valid conversation states 
(message not sent and not received, or sent and received). Indeed, specifications for inter- 
action protocols often assume the use of underlying synchronization protocols to guarantee 
delivery of messages (Paurobally & Cunningham, 2003; Paurobally, Cunningham, & Jennings,
2003). Under such an assumption, for every message, there are only two joint states
regardless of the number of roles. For example, for a single message and three roles (a
sender and two receivers), there are two places and two possible markings: A token in a
before-sending/receiving place represents a conversation state where the message has not
yet been sent by the sender (and the two receivers are waiting for it), while a token in an
after-sending/receiving place denotes that the message has been sent and received by both
receivers.

Nowostawski et al. (2001) utilize CP-nets where places denote joint conversation states.
They also utilize places representing communicative acts. Poutakidis et al. (2002) proposed
a representation based on Place-Transition nets (PT-nets), a more restricted representation
of Petri nets that has no color. They presented several interaction building blocks, which
could then fit together to model additional conversation protocols. In general, the following 
proposition holds with respect to such representations: 

Proposition 2 Given a conversation with R roles and a total of M possible messages, a 
joint state representation that represents only legal states has space complexity of O(M).

The condition of representing only valid states is critical to the complexity analysis. If all
joint conversation states, valid and invalid, are to be represented, the space complexity would
be O(M^R). In such a case, an individual-state representation would have an advantage. This
would be the case, for instance, if we do not assume the use of synchronization protocols, 
e.g., where the overhearing agent may wish to track the exact system state even while a 
message is underway (i.e., sent and not yet received). 
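
The counts behind Propositions 1 and 2 can be checked with a short sketch. The function names are hypothetical, and the formulas are simply the per-message counts stated in the text (two places per role per message for individual states, two joint places per message for legal joint states); they are upper bounds that ignore any sharing of places between consecutive messages. The enumeration covers a single message only.

```python
from itertools import product


def individual_state_places(num_messages, num_roles):
    """Two places (before/after) in each role, for every message."""
    return 2 * num_messages * num_roles


def joint_state_places(num_messages):
    """Two joint places (before/after) per message, whatever the number of roles."""
    return 2 * num_messages


def all_states(num_roles):
    """Every combination of per-role status for one message, valid or not."""
    return list(product(("before", "after"), repeat=num_roles))


def legal_states(num_roles):
    """Only the combinations in which all roles agree (synchronized delivery)."""
    return [s for s in all_states(num_roles) if len(set(s)) == 1]


places_two_roles = individual_state_places(1, 2)   # one message, two roles
places_joint = joint_state_places(1)               # one message, any number of roles
```

For one message and two roles this reproduces the four places and four possible conversation states discussed above, of which only two are legal under the synchronization assumption.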

2.3 Representing Multiple Concurrent Conversations 

Propositions 1 and 2 above address the space complexity of representing a single conver- 
sation. However, in large scale systems an overhearing agent may be required to monitor 
multiple conversations in parallel. For instance, an overhearing agent may be monitoring a 
middle agent that is carrying multiple parallel instances of a single interaction protocol with 
multiple partners, e.g., brokering (FIPA Specifications, 2003a). 

Some previous investigations propose to duplicate the appropriate Petri net representa- 
tion for each monitored conversation (Nowostawski et al., 2001; Poutakidis et al., 2002). In 
this approach, every conversation is tracked by a separate Petri net, and thus the number
of Petri nets (and their associated tokens) grows with the number of conversations (Proposition 3).
For instance, Nowostawski et al. (2001) show an example where a contract-net
protocol is carried out with three different contractors, using three duplicate CP-nets. This 
is captured in the following proposition: 

Proposition 3 A representation that creates multiple instances of a conversation Petri net
to represent C conversations, requires O(C) net structures, and O(C) bits for all tokens.

Other investigations take a different approach, in which a single CP-net structure is used
to monitor all conversations of the same protocol. The tokens associated with conversations
are differentiated by their token color (Cost, 1999; Cost et al., 1999, 2000; Lin et al., 2000;
Mazouzi et al., 2002; Cranefield et al., 2002; Purvis et al., 2002; Ramos et al., 2002). For
example, by assigning each token a color of the tuple type (sender, receiver), an agent can
differentiate multiple tokens in the same place and thus track conversations of different pairs
of agents³. Color tokens use multiple bits per token; up to log C bits are required to differentiate
C conversations. Therefore, the number of bits required to track C conversations
using C tokens is C log C. This leads to the following proposition.

3. See Section 4 to distinguish between different conversations by the same agents.

Proposition 4 A representation that uses color tokens to represent C multiple instances of
a conversation, requires O(1) net structures, and O(C log C) bits for all tokens.

Due to the constants involved, the space requirements of Proposition 3 are in practice
much more expensive than those of Proposition 4. Proposition 3 refers to the creation of
O(C) Petri nets, each with duplicated place and transition data structures. In contrast,
Proposition 4 refers to bits required for representing C color tokens on a single CP-net.
Moreover, in most practical settings, a sufficiently large constant bound on the number of
conversations may be found, which will essentially reduce the O(log C) factor to O(1).
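
As a rough numerical illustration of Proposition 4, the following sketch counts the token bits needed to distinguish C conversations. The function names and the 32-bit default are assumptions made for this example, not figures from the paper; the identifier width is computed with integer arithmetic as the number of bits needed to number C distinct conversations.

```python
def color_token_bits(num_conversations):
    """Bits for C color tokens, each carrying an identifier that can
    distinguish C conversations (at least one bit per token)."""
    bits_per_token = max(1, (num_conversations - 1).bit_length())
    return num_conversations * bits_per_token


def fixed_width_token_bits(num_conversations, width=32):
    """Bits for C tokens when a constant bound on the number of
    conversations is assumed (hypothetical 32-bit identifiers)."""
    return num_conversations * width


bits_for_8 = color_token_bits(8)          # 8 tokens of 3 bits each
bits_for_1000 = color_token_bits(1000)    # 1000 tokens of 10 bits each
bits_bounded = fixed_width_token_bits(1000)
```

With a fixed identifier width the token cost grows linearly in C, which is the sense in which the O(log C) factor reduces to O(1).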

Based on Propositions 1-4, it is possible to make concrete predictions as to the scalability 
of different approaches with respect to the number of conversations and roles. Table 1 shows
the space complexity of different approaches when modelling C conversations of the same 
protocol, each with a maximum of R roles, and M messages, under the assumption of 
underlying synchronization protocols. The table also cites relevant previous work. 





                       Representing Multiple Conversations (of Same Protocol)

                  | Multiple CP- or PT-nets     | Using color tokens, single CP-net
                  | (Proposition 3)             | (Proposition 4)
------------------+-----------------------------+----------------------------------------------
Individual States | Space: O(MRC)               | Space: O(MR + C log C)
(Proposition 1)   |                             | Cost (1999), Cost et al. (1999, 2000),
                  |                             | Lin et al. (2000), Cranefield et al. (2002),
                  |                             | Purvis et al. (2002), Ramos et al. (2002),
                  |                             | Mazouzi et al. (2002)
------------------+-----------------------------+----------------------------------------------
Joint States      | Space: O(MC)                | Space: O(M + C log C)
(Proposition 2)   | Nowostawski et al. (2001),  | This paper
                  | Poutakidis et al. (2002)    |

Table 1: Scalability of different representations

Building on the insights gained from Table 1, we propose a representation using CP-nets
where places explicitly represent joint conversation states (corresponding to the lower-right
cell in Table 1), and token color is used to distinguish concurrent conversations (as in the
upper-right cell in Table 1). As such, it is related to the works that have these features, but
as the table demonstrates, is a novel synthesis.

Our representation uses similar structures to those found in the works of Nowostawski 
et al. (2001) and Poutakidis et al. (2002). However, in contrast to these previous investi- 
gations, we rely on token color in CP-nets to model concurrent conversations, with space 
complexity O(M + C log C). We also show (Sections 3-6) how it can be used to cover a
variety of conversation features not covered by these investigations. These features include 
representation of a full set of FIPA interaction building blocks, communicative act attributes 
(such as message guards, sequence expressions, etc.), compact modelling of concurrent conversations,
nested and interleaved interactions, and temporal aspects.

3. Representing Simple & Complex Interaction Building Blocks

This section introduces the fundamentals of our representation, and demonstrates how var- 
ious simple and complex AUML interaction messages, used in FIPA conversation standards 
(FIPA Specifications, 2003c), can be implemented using the proposed CP-net representa- 
tion. We begin with a simple conversation, shown in Figure 1-a using an AUML protocol 
diagram. Here, agent1 sends an asynchronous message msg to agent2.




Figure 1: Asynchronous message interaction. (a) AUML representation. (b) CP-net
representation, with the following net declarations:

    color AGENT = ...;
    color INTER-STATE = record a1:AGENT * a2:AGENT;
    color MSG = record s:AGENT * r:AGENT;
    var s, r: AGENT;



To represent agent conversation protocols, we define two types of places, corresponding 
to messages and conversation states. The first type of net places, called message places, is 
used to describe conversation communicative acts. Tokens placed in message places indicate 
that the associated communicative act has been overheard. The second type of net places, 
agent places, is associated with the valid joint conversation states of the interacting agents. 
Tokens placed in agent places indicate the current joint state of the conversation within the 
interaction protocol. 

Transitions represent the transmission and receipt of communicative acts between agents.
Assuming underlying synchronization protocols, a transition always originates within a joint-state
place and a message place, and targets a joint conversation state (more than one is
possible; see below). Normally, the current conversation state is known (marked with a
token), and must await the overhearing of the matching message (denoted with a token at
the connected message place). When the message place is marked, the transition fires, automatically
marking the new conversation state.

Figure 1-b presents the CP-net representation of the earlier example of Figure 1-a. The CP-net
in Figure 1-b has three places and one transition connecting them. The A1B1 and the
A2B2 places are agent places, while the msg place is a message place. The A and B capital
letters are used to denote the agent1 and the agent2 individual interaction states respectively
(we have indicated the individual and the joint interaction states over the AUML diagram
in Figure 1-a, but omit these annotations in later figures). Thus, the A1B1 place indicates a
joint interaction state where agent1 is ready to send the msg communicative act to agent2
(A1) and agent2 is waiting to receive the corresponding message (B1). The msg message
place corresponds to the msg communicative act sent between the two agents. Thus, the
transmission of the msg communicative act causes the agents to transition to the A2B2
place. This place corresponds to the joint interaction state in which agent1 has already sent
the msg communicative act to agent2 (A2) and agent2 has received it (B2).

The CP-net implementation in Figure 1-b also introduces the use of token colors to 
represent additional information about interaction states and communicative acts. The 
token color sets are defined in the net declaration, i.e. the dashed box in Figure 1-b. 
The syntax follows the standard CPN ML notation (Wikstrom, 1987; Milner, Harper, & 
Tofte, 1990; Jensen, 1997a). The AGENT color identifies the agents participating in the 
interaction, and is used to construct the two compound color sets. 

The INTER-STATE color set is associated with agent places, and represents agents in
the appropriate joint interaction states. It is a record (a1, a2), where a1 and a2 are AGENT
color elements distinguishing the interacting agents. We apply the INTER-STATE color
set to model multiple concurrent conversations using the same CP-net. The second color
set is MSG, describing interaction communicative acts and associated with message places.
The MSG color token is a record (a_s, a_r), where a_s and a_r correspond to the sender and
the receiver agents of the associated communicative act. In both cases, additional elements,
such as conversation identification, may be used. See Section 4 for additional details.

In Figure 1-b, the A1B1 and the A2B2 places are associated with the INTER-STATE
color set, while the msg place is associated with the MSG color set. The place color set
is written in italic capital letters next to the corresponding place. Furthermore, we use
the s and r AGENT color type variables to denote the net arc expressions. Thus, given
that the output arc expression of both the A1B1 and the msg places is (s,r), the s and r
elements of the agent place token must correspond to the s and r elements of the message
place token. Consequently, the net transition occurs if and only if the agents of the message
correspond to the interacting agents. The A2B2 place input arc expression is (r,s), following
the underlying intuition that agent2 is going to send the next interaction communicative
act.
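
The role of the arc expressions can be sketched in a few lines. This is an illustrative approximation, not CPN ML and not the authors' implementation: the named tuples InterState and Msg mirror the INTER-STATE and MSG records, places are modelled as plain lists of tokens, and the function fire_async and the agent names are hypothetical.

```python
from collections import namedtuple

InterState = namedtuple("InterState", ["a1", "a2"])  # mirrors the INTER-STATE record
Msg = namedtuple("Msg", ["s", "r"])                  # mirrors the MSG record


def fire_async(a1b1, msg_place, a2b2):
    """Fire the transition once for every binding (s, r) that is present
    in both the agent place and the message place."""
    fired = []
    for msg in list(msg_place):              # iterate over a copy while removing
        state = InterState(msg.s, msg.r)     # input arcs both carry (s, r)
        if state in a1b1:
            a1b1.remove(state)
            msg_place.remove(msg)
            a2b2.append(InterState(msg.r, msg.s))  # output arc carries (r, s)
            fired.append(msg)
    return fired


# Two concurrent conversations tracked on the same net structure.
a1b1 = [InterState("x", "y"), InterState("p", "q")]
msg_place = [Msg("x", "y"), Msg("q", "p")]   # second message has no matching state
a2b2 = []
fired = fire_async(a1b1, msg_place, a2b2)
```

Only the message whose sender and receiver match a token in the agent place enables the transition, so tokens of different conversations do not interfere with one another.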

Figure 2-a shows an AUML representation of another interaction building block, synchronous
message passing, denoted by the filled solid arrowhead. Here, the msg communicative
act is sent synchronously from agent1 to agent2, meaning that an acknowledgement
of the msg communicative act must always be received by agent1 before the interaction may
proceed.

The corresponding CP-net representation is shown in Figure 2-b. The interaction starts
in the A1B1 place and terminates in the A2B2 place. The A1B1 place represents a joint
interaction state where agent1 is ready to send the msg communicative act to agent2 (A1)
and agent2 is waiting to receive the corresponding message (B1). The A2B2 place denotes
a joint interaction state, in which agent1 has already sent the msg communicative act to
agent2 (A2) and agent2 has received it (B2). However, since the CP-net diagram represents
synchronous message passing, the msg communicative act transmission cannot cause the
agents to transition directly from the A1B1 place to the A2B2 place. We therefore define an
intermediate A'1B'1 agent place. This place represents a joint interaction state where agent2
has received the msg communicative act and is ready to send an acknowledgement of it
(B'1), while agent1 is waiting for that acknowledgement (A'1). Taken together, the msg
communicative act causes the agents to transition from the A1B1 place to the A'1B'1 place,
while the acknowledgement of the msg message causes the agents to transition from the
A'1B'1 place to the A2B2 place.

Figure 2: Synchronous message interaction. (a) AUML representation. (b) CP-net
representation, with message places msg and ack-msg (color set MSG), agent places of
color set INTER-STATE, and the following net declarations:

    color AGENT = ...;
    color INTER-STATE = record a1:AGENT * a2:AGENT;
    color MSG = record s:AGENT * r:AGENT;
    var s, r: AGENT;
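
The two-step structure of the synchronous building block can be sketched as a lookup from (joint state, overheard message) to the next joint state. This is a simplified, uncolored illustration under stated assumptions: the table SYNC_NET, the function overhear, and the label "intermediate" (standing for the intermediate agent place between A1B1 and A2B2) are hypothetical names, and an unexpected message is simply taken to leave the tracked state unchanged.

```python
# (current joint state, overheard message) -> next joint state
SYNC_NET = {
    ("A1B1", "msg"): "intermediate",      # msg received, acknowledgement pending
    ("intermediate", "ack-msg"): "A2B2",  # acknowledgement overheard
}


def overhear(state, messages, net=SYNC_NET):
    """Advance the tracked joint state over a sequence of overheard messages.
    Tracking stops at the first message that is not expected in the current state."""
    for message in messages:
        next_state = net.get((state, message))
        if next_state is None:
            break
        state = next_state
    return state


after_msg = overhear("A1B1", ["msg"])
after_ack = overhear("A1B1", ["msg", "ack-msg"])
out_of_order = overhear("A1B1", ["ack-msg"])
```

The msg message alone cannot move the conversation to A2B2; the acknowledgement is needed for the second transition.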

Transitions in typical multi-agent interaction protocols are composed of interaction
building blocks, two of which have been presented above. Additional interaction building
blocks, which are fairly straightforward (or have appeared in previous work, e.g., Poutakidis
et al., 2002) are presented in Appendix B. In the remainder of this section, we present two
complex interaction building blocks that are generally common in multi-agent interactions:
XOR-decision and OR-parallel.

We begin with the XOR-decision interaction. The AUML representation to this building 
block is shown in Figure 3-a. The sender agent agenti can either send message msg\ to 
agenti or message msgi to agents, but it can not send both msgi and msg2- The non- filled 
diamond with an 'x' inside is the AUML notation for this constraint. 



[Figure 3 panels: (a) AUML representation; (b) CP-net representation. Declarations in (b):
color AGENT = ...; color INTER-STATE = record a1:AGENT * a2:AGENT;
color INTER-STATE-3 = record a1:AGENT * a2:AGENT * a3:AGENT;
color MSG = record s:AGENT * r:AGENT; var s, r1, r2: AGENT;]

Figure 3: XOR-decision messages interaction.

Figure 3-b shows the corresponding CP-net. Again, the capital letters A, B and C
are used to denote the interaction states of agent1, agent2 and agent3, respectively. The
interaction starts from the A1B1C1 place and terminates either in the A2B2 place or in the
A2C2 place. The A1B1C1 place represents a joint interaction state where agent1 is ready to
send either the msg1 communicative act to agent2 or the msg2 communicative act to agent3
(A1), and agent2 and agent3 are waiting to receive the corresponding msg1/msg2 message
(B1/C1). To represent the A1B1C1 place color set, we extend the INTER-STATE color
set to denote a joint interaction state of three interacting agents, i.e., we use the
INTER-STATE-3 color set. The msg1 communicative act causes the agents to transition to the
A2B2 place. The A2B2 place represents a joint interaction state where agent1 has sent the msg1
message (A2), and agent2 has received it (B2). Similarly, the msg2 communicative act causes
agents agent1 and agent3 to transition to the A2C2 place. Exclusiveness is achieved since the
single agent token in the A1B1C1 place can be used either for activating the A1B1C1 → A2B2
transition or for activating the A1B1C1 → A2C2 transition, but not both.
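
The exclusiveness argument can be illustrated with a short Python sketch. It is a
simplification made for this illustration: places are plain lists, the send helper and its
keep parameter are hypothetical names, and both message tokens are assumed to be present so
that only the single agent token decides which transition fires.

```python
# Places hold lists of tokens; a token is a tuple of agent names.
places = {
    "A1B1C1": [("agent1", "agent2", "agent3")],  # single three-agent token
    "msg1": [("agent1", "agent2")],
    "msg2": [("agent1", "agent3")],
    "A2B2": [],
    "A2C2": [],
}

def send(places, msg_place, target_place, keep):
    """Try to fire the transition for msg_place; keep selects the agents that move on."""
    if not places["A1B1C1"] or not places[msg_place]:
        return False  # no agent token or no message token: not enabled
    token = places["A1B1C1"].pop()
    places[msg_place].pop()
    places[target_place].append(tuple(token[i] for i in keep))
    return True

first = send(places, "msg1", "A2B2", keep=(0, 1))   # agent1 sends msg1 to agent2
second = send(places, "msg2", "A2C2", keep=(0, 2))  # the agent token is already used up
```

After the first call the A1B1C1 place is empty, so the second transition is no longer
enabled even though its message token is still available.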

A similar complex interaction is the OR-parallel messages interaction. Its AUML
representation is presented in Figure 4-a. The sender agent, agent1, can send message msg1 to
agent2 or message msg2 to agent3, or both. The non-filled diamond is the AUML notation
for this constraint.




[Figure 4 panels: (a) AUML representation; (b) CP-net representation. Declarations in (b):
color AGENT = ...; color INTER-STATE = record a1:AGENT * a2:AGENT;
color INTER-STATE-3 = record a1:AGENT * a2:AGENT * a3:AGENT;
color MSG = record s:AGENT * r:AGENT; var s, r1, r2: AGENT;]

Figure 4: OR-parallel messages interaction.



Figure 4-b shows the CP-net representation of the OR-parallel interaction. The
interaction starts from the A1B1C1 place but it can terminate in the A2B2 place, or in the
A2C2 place, or in both. To represent this inclusiveness of the interaction protocol, we define
two intermediate places, the A'1B1 place and the A''1C1 place. The A'1B1 place represents a
joint interaction state where agent1 is ready to send the msg1 communicative act to agent2
(A'1) and agent2 is waiting to receive the message (B1). The A''1C1 place has a similar
meaning, but with respect to agent3. As normally done in Petri nets, the transition connecting
the A1B1C1 place to the intermediate places duplicates any single token in the A1B1C1 place
into two tokens going into the A'1B1 and the A''1C1 places. Consequently, the two parts of
the OR-parallel interaction can be independently executed.
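
A minimal Python sketch of the token duplication follows. The list-based places and the
fork and send helpers are assumptions made for this illustration; the intermediate places
are written A'1B1 and A''1C1.

```python
places = {
    "A1B1C1": [("agent1", "agent2", "agent3")],
    "A'1B1": [], "A''1C1": [],          # intermediate places
    "msg1": [("agent1", "agent2")],
    "msg2": [("agent1", "agent3")],
    "A2B2": [], "A2C2": [],
}

def fork(places):
    """Split the three-agent token into one token per branch."""
    a1, a2, a3 = places["A1B1C1"].pop()
    places["A'1B1"].append((a1, a2))
    places["A''1C1"].append((a1, a3))

def send(places, source, msg_place, target):
    """Fire one branch if both its agent token and its message token are present."""
    if not places[source] or not places[msg_place]:
        return False
    places[msg_place].pop()
    places[target].append(places[source].pop())
    return True

fork(places)
sent1 = send(places, "A'1B1", "msg1", "A2B2")
sent2 = send(places, "A''1C1", "msg2", "A2C2")
```

Because each branch owns its own token after the fork, either call to send may be omitted
without affecting the other, which gives the "one, the other, or both" behavior.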



4. Representing Interaction Attributes 

We now extend our representation to cover additional interaction aspects that are useful in
describing multi-agent conversation protocols. First, we show how to represent interaction
message attributes, such as guards, sequence expressions, cardinalities and content (FIPA
Specifications, 2003c). We then explore in depth the representation of multiple concurrent
conversations (on the same CP-net).

Figure 5-a shows a simple agent interaction using an AUML protocol diagram. This
interaction is similar to the one presented in Figure 1-a in the previous section. However,
Figure 5-a uses an AUML message guard-condition, marked as [condition], that has the
following semantics: the communicative act is sent from agent1 to agent2 if and only if the
condition is true.



[Figure 5 panels: (a) AUML representation; (b) CP-net representation. Declarations in (b):
color AGENT = ...; color TYPE = ...; color CONTENT = ...;
color INTER-STATE = record a1:AGENT * a2:AGENT;
color MSG = record s:AGENT * r:AGENT * t:TYPE * c:CONTENT;
var s, r: AGENT; var t: TYPE; var c: CONTENT;]

Figure 5: Message guard-condition.



The guard-condition implementation in our Petri net representation uses transition
guards (Figure 5-b), a native feature of CP-nets. The AUML guard condition is mapped
directly to the CP-net transition guard. The CP-net transition guard is indicated in the
net inscription next to the corresponding transition, using square brackets. The transition
guard guarantees that the transition can be enabled only when the guard evaluates to true.

In Figure 5-b, we also extend the color of tokens to include information about the
communicative act being used and its content. We extend the MSG color set definition
to a record (s, r, t, c), where the s and r elements have the same interpretation as in the
previous section (sender and receiver), and the t and c elements define the message type and
content, respectively. The t element is of a new color TYPE, which determines communicative
act types. The c element is of a new color CONTENT, which represents communicative act
content and argument list (e.g., reply-to, reply-by, etc.).
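
As a rough Python analogue, the extended message record and a guarded transition might look
as follows. The Msg class, the guarded_fire helper and the sample content string are
assumptions for this sketch; only the (s, r, t, c) fields come from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Msg:
    s: str  # sender
    r: str  # receiver
    t: str  # communicative act type
    c: str  # content

def guarded_fire(agent_tokens, msg_tokens, guard):
    """Return the (agent pair, message) binding that enables the transition, or None."""
    for pair in agent_tokens:
        for msg in msg_tokens:
            # The message endpoints must match the joint state and the guard must hold.
            if (msg.s, msg.r) == pair and guard(msg):
                return pair, msg
    return None

state = [("agent1", "agent2")]
inbox = [Msg("agent1", "agent2", "inform", "price=10")]
condition = lambda m: m.t == "inform"   # stands in for the AUML [condition]
binding = guarded_fire(state, inbox, condition)
blocked = guarded_fire(state, inbox, lambda m: False)
```

When the guard is false no binding exists, so the transition stays disabled even though
both input places hold suitable tokens.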

The addition of new elements also allows for additional potential uses. For instance, 
to facilitate representation of multiple concurrent conversations between the same agents 
(s and r), it is possible to add a conversation identification field to both the MSG and 
INTER-STATE colors. For simplicity, we refrain from doing so in the examples in this 
paper. 

Two additional AUML communicative act attributes that can be modelled in the CP-net
representation are message sequence-expression and message cardinality. A
sequence-expression denotes a constraint on the message sent from the sender agent. There
are a number of sequence-expressions defined by FIPA conversation standards (FIPA
Specifications, 2003c): m denotes that the message is sent exactly m times; n..m denotes that
the message is sent anywhere from n up to m times; * denotes that the message is sent an
arbitrary number of times. An additional important sequence expression is broadcast, i.e.,
the message is sent to all other agents.

We now explain the representation of sequence-expressions in CP-nets, using broadcast
as an example (Figure 6-b). Other sequence expressions are easily derived from this example.
We define an INTER-STATE-CARD color set. This color set is a tuple ((a1, a2), i) consisting
of two elements. The first tuple element is an INTER-STATE color element, which denotes
the interacting agents as previously defined. The second tuple element is an integer that
counts the number of messages already sent by an agent, i.e., the message cardinality.
This element is initially set to 0. The INTER-STATE-CARD color set is applied to
the S1R1 place, where the capital letters S and R are used to denote the individual
interaction states of the sender and the receiver, respectively, and S1R1 indicates the
initial joint interaction state of the interacting agents. The two additional colors used in
Figure 6-b are the BROADCAST-LIST and the TARGET colors. The BROADCAST-LIST color defines
the sender's broadcast list of the designated receivers, assuming that the sender must have
such a list to carry out its role. The TARGET color defines indexes into this broadcast list.

[Figure 6 panels: (a) AUML representation; (b) CP-net representation. Declarations in (b):
color INTER-STATE = record a1:AGENT * a2:AGENT; color CARD = int;
color INTER-STATE-CARD = product INTER-STATE * CARD;
color MSG = record s:AGENT * r:AGENT * t:TYPE * c:CONTENT;
color BROADCAST-LIST = AGENT with ...; val size = ...;
color TARGET = index BROADCAST-LIST with 0..size-1;
var s, r: AGENT; var msg: MSG; var i: CARD;]

Figure 6: Broadcast sequence expression.

According to the broadcast sequence-expression semantics, the sender agent sends the
same msg1 communicative act to all the receivers on the broadcast list. The CP-net
introduced in Figure 6-b models this behavior. The interaction starts from the S1R1 place,
representing the joint interaction state where sender is ready to send the msg1
communicative act to receiver (S1) and receiver is waiting to receive the corresponding msg1
message (R1). The S1R1 place initial marking is a single token, set by the initialization
expression (underlined, next to the corresponding place). The initialization expression
1`((s,TARGET(0)),0), given in standard CPN ML notation, determines that the S1R1
place's initial marking is a multi-set containing a single token ((s, TARGET(0)), 0). Thus,
the first designated receiver is assigned to be the agent with index 0 on the broadcast list,
and the message cardinality counter is initialized to 0.



4. We implement broadcast as an iterative procedure sending the corresponding communicative
act separately to all designated recipients.

The msg1 message place initially contains multiple tokens. Each of these tokens
represents the msg1 communicative act addressed to a different designated receiver on the
broadcast list. In Figure 6-b, the initialization expression corresponding to the msg1
message place has been omitted. The S1R1 place token and the appropriate msg1 place token
together enable the corresponding transition. Consequently, the transition may fire and thus
the msg1 communicative act transmission is simulated.

The msg1 communicative act is sent incrementally to every designated receiver on the
broadcast list. The incoming arc expression ((s, r), i) is incremented by the transition to
the outgoing ((s, TARGET(i + 1)), i + 1) arc expression, causing the receiver agent with
index i + 1 on the broadcast list to be selected. The transition guard constraint i < size,
i.e., i < |broadcast list|, ensures that the msg1 message is sent no more than |broadcast list|
times. The msg1 communicative act causes the agents to transition to the S2R2 place.
This place represents a joint interaction state in which sender has already sent the msg1
communicative act to receiver and is now waiting to receive the msg2 message (S2) and
receiver has received the msg1 message and is ready to send the msg2 communicative act
to sender (R2). Finally, the msg2 message causes the agents to transition to the S3R3
place. The S3R3 place denotes a joint interaction state where sender has received the msg2
communicative act from receiver and terminated (S3), while receiver has already sent the
msg2 message to sender and terminated as well (R3).
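
The counter-driven iteration can be mimicked in Python as follows. This is a sketch under
assumptions: the three-agent broadcast list is hypothetical, the send_next helper is
introduced here, and the handling of the last receiver (no successor token once the list is
exhausted) is a choice made for this sketch rather than something stated in the text.

```python
broadcast_list = ["agent2", "agent3", "agent4"]   # hypothetical receivers
size = len(broadcast_list)

def target(i):
    """TARGET(i): the receiver at index i of the broadcast list."""
    return broadcast_list[i]

def send_next(token):
    """Fire the send transition once: ((s, r), i) -> ((s, TARGET(i + 1)), i + 1)."""
    (s, r), i = token
    if not i < size:                       # transition guard [i < size]
        return None, None
    sent = (s, r)                          # this pair moves on to the S2R2 place
    # Assumption: no successor token is produced after the last receiver,
    # because TARGET(size) would fall outside the broadcast list.
    nxt = ((s, target(i + 1)), i + 1) if i + 1 < size else None
    return sent, nxt

token = (("agent1", target(0)), 0)         # initial marking ((s, TARGET(0)), 0)
s2r2 = []
while token is not None:
    sent, token = send_next(token)
    if sent is not None:
        s2r2.append(sent)
```

Each firing serves one receiver and advances the counter, so the number of sends is bounded
by the length of the broadcast list.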

We use Figure 6-b to demonstrate the use of token color to represent multiple concurrent
conversations using the same CP-net. For instance, let us assume that the sender agent is
called agent1 and its broadcast list contains the following agents: agent2, agent3, agent4,
agent5 and agent6. We will also assume that agent1 has already sent the msg1
communicative act to all agents on the broadcast list. However, it has only received the msg2
reply message from agent3 and agent6. Thus, the CP-net current marking for the complete
interaction protocol is described as follows: the S2R2 place is marked by (agent2, agent1),
(agent4, agent1), (agent5, agent1), while the S3R3 place contains the tokens (agent1, agent3)
and (agent1, agent6).

An Example. We now construct a CP-net representation of the FIPA Query Interaction
Protocol (FIPA Specifications, 2003d), shown in AUML form in Figure 7, to demonstrate
how the building blocks presented in Sections 3 and 4 can be put together. In this interaction
protocol, the Initiator requests the Participant to perform an inform action using one of two
query communicative acts, query-if or query-ref. The Participant processes the query and
decides whether to accept or refuse the query request. The Initiator may request
the Participant to respond with either an agree or a refuse message, and for simplicity,
we will assume that this is always the case. In case the query request has been accepted,
the Participant informs the Initiator of the query results. If the Participant fails, then
it communicates a failure. In a successful response, the Participant replies with one of
two versions of inform (inform-t/f or inform-result) depending on the type of the initial
query request.

The CP-net representation of the FIPA Query Interaction Protocol is presented in
Figure 8. The interaction starts in the I1P1 place (we use the capital letters I and P
to denote the Initiator and the Participant roles). The I1P1 place represents a joint
interaction state where (i) the Initiator agent is ready to send either the query-if
communicative act, or the query-ref message, to Participant (I1); and (ii) Participant is
waiting to receive the corresponding message (P1). The Initiator can send either a query-if
or a query-ref communicative act. We assume that these acts belong to the same class,
the query communicative act class. Thus, we implement both messages using a single
Query message place, and check the message type using the following transition guard:
[#t msg = query-if or #t msg = query-ref]. The query communicative act causes the
interacting agents to transition to the I2P2 place. This place represents a joint interaction
state in which Initiator has sent the query communicative act and is waiting to receive
a response message (I2), and Participant has received the query communicative act and
is deciding whether to send an agree or a refuse response message to Initiator (P2). The
refuse communicative act causes the agents to transition to the I3P3 place, while the agree
message causes the agents to transition to the I4P4 place.

Figure 7: FIPA Query Interaction Protocol - AUML representation.

The Participant decision on whether to send an agree or a refuse communicative
act is represented using the XOR-decision building block introduced earlier (Figure 3-b).
The I3P3 place represents a joint interaction state where Initiator has received a refuse
communicative act and terminated (I3) and Participant has sent a refuse message and
terminated as well (P3). The I4P4 place represents a joint interaction state in which Initiator
has received an agree communicative act and is now waiting for a further response from
Participant (I4) and Participant has sent an agree message and is now deciding which
response to send to Initiator (P4). At this point, the Participant agent may send one
of the following communicative acts: inform-t/f, inform-result and failure. The choice is
represented using another XOR-decision building block, where the inform-t/f and
inform-result communicative acts are represented using a single Inform message place. The
failure communicative act causes a transition to the I5P5 place, while the inform message
causes a transition to the I6P6 place. The I5P5 place represents a joint interaction state where
Participant has sent a failure message and terminated (P5), while Initiator has received
a failure and terminated (I5). The I6P6 place represents a joint interaction state in which
Participant has sent an inform message and terminated (P6), while Initiator has received
an inform and terminated (I6).

Figure 8: FIPA Query Interaction Protocol - CP-net representation.

The implementation of the [query-if] and the [query-ref] message guard conditions
requires a more detailed discussion. These guards are not implemented in the usual manner
because they depend on the original request communicative act. Thus, we create a special
intermediate place that contains the original message type, marked "Original Message Type"
in the figure. In case an inform communicative act is sent, the transition guard verifies
that the inform message is appropriate to the original query type. Thus, an inform-t/f
communicative act can be sent only if the original query type was query-if, and an
inform-result message can be sent only if the original query type was query-ref.
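
The joint-state structure of the Query protocol, including the check against the original
query type, can be sketched as a small Python state machine. This is an illustrative
abstraction, not the CP-net itself: the tables TRANSITIONS, MSG_CLASS and EXPECTED_INFORM
and the run function are names assumed here, and tokens, senders and receivers are omitted.

```python
# Joint-state transitions of the Query protocol: (place, message class) -> next place.
TRANSITIONS = {
    ("I1P1", "query"): "I2P2",
    ("I2P2", "refuse"): "I3P3",
    ("I2P2", "agree"): "I4P4",
    ("I4P4", "failure"): "I5P5",
    ("I4P4", "inform"): "I6P6",
}
MSG_CLASS = {"query-if": "query", "query-ref": "query",
             "inform-t/f": "inform", "inform-result": "inform"}
# Which inform variant answers which original query type.
EXPECTED_INFORM = {"query-if": "inform-t/f", "query-ref": "inform-result"}

def run(messages):
    """Replay a list of message types; return the final place, or None if illegal."""
    place, original = "I1P1", None   # original plays the "Original Message Type" place
    for t in messages:
        cls = MSG_CLASS.get(t, t)
        if (place, cls) not in TRANSITIONS:
            return None              # no transition is enabled for this message
        if cls == "query":
            original = t             # remember the query type for the later guard
        if cls == "inform" and EXPECTED_INFORM[original] != t:
            return None              # guard: inform must match the original query
        place = TRANSITIONS[(place, cls)]
    return place
```

For example, run(["query-if", "agree", "inform-t/f"]) reaches I6P6, whereas replacing the
last message with inform-result is rejected by the guard.
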
5. Representing Nested & Interleaved Interactions

In this section, we extend the CP-net representation of the previous sections to model nested
and interleaved interaction protocols. We focus here on nested interaction protocols.
Nevertheless, the discussion applies to interleaved interaction protocols in a similar
fashion.

FIPA conversation standards (FIPA Specifications, 2003c) emphasize the importance of 
nested and interleaved protocols in modelling complex interactions. First, this allows re- 
use of interaction protocols in different nested interactions. Second, nesting increases the 
readability of interaction protocols. 

The AUML notation depicts nested and interleaved protocols as rounded-corner
rectangles (Odell et al., 2001a; FIPA Specifications, 2003c). Figure 9-a shows an example of
a nested protocol (see footnote 5), while Figure 9-b illustrates an interleaved protocol.
Nested protocols have one or more compartments. The first compartment is the name
compartment. The name compartment holds the (optional) name of the nested protocol. The
nested protocol name is written in the upper left-hand corner of the rectangle, e.g.,
commitment in Figure 9-a. The second compartment, the guard compartment, holds the (optional)
nested protocol guard. The guard is written in the lower left-hand corner of the rectangle,
e.g., [commit] in Figure 9-a. Nested protocols without guards are equivalent to nested
protocols with the [true] guard.

Figure 9: AUML nested and interleaved protocols examples.

Figure 10 describes the implementation of the nested interaction protocol presented in
Figure 9-a by extending the CP-net representation to use hierarchies, relying on standard
CP-net methods (see Appendix A). The hierarchical CP-net representation contains
three elements: a superpage, a subpage and a page hierarchy graph. The CP-net superpage
represents the main interaction protocol containing a nested interaction, while the CP-net
subpage models the corresponding nested interaction protocol, i.e., the Commitment
Interaction Protocol. The page hierarchy graph describes how the superpage is decomposed into
subpages.

5. Figure 9-a appears in FIPA conversation standards (FIPA Specifications, 2003c). Nonetheless, note that
the request-good and the request-pay communicative acts are not part of the FIPA-ACL standards.

Figure 10: Nested protocol implementation using hierarchical CP-nets.

Let us consider in detail the process of modelling the nested interaction protocol in
Figure 9-a using a hierarchical CP-net, resulting in the net described in Figure 10. First, we
identify the starting and ending points of the nested interaction protocol. The starting point
of the nested interaction protocol is where Buyer1 sends a Request-Good communicative act
to Seller1. The ending point is where Buyer1 receives a Request-Pay communicative act
from Seller1. We model these nested protocol end-points as CP-net socket nodes on the
superpage, i.e., the Main Interaction Protocol: B11S11 and Request-Good are input socket
nodes and B13S13 is an output socket node.

The nested interaction protocol, the Commitment Interaction Protocol, is represented
using a separate CP-net, following the principles outlined in Sections 3 and 4. This net
is a subpage of the main interaction protocol superpage. The nested interaction protocol
starting and ending points on the subpage correspond to the net port nodes. The B1S1 and
Request-Good places are the subpage input port nodes, while the B3S3 place is an output
port node. These nodes are tagged with the IN and OUT port type tags, respectively.

Then, a substitution transition, which is denoted using HS (Hierarchy and Substitu- 
tion), connects the corresponding socket places on the superpage. The substitution tran- 
sition conceals the nested interaction protocol implementation from the net superpage, i.e. 
the Main Interaction Protocol. The nested protocol name and guard compartments are 
mapped directly to the substitution transition name and guard respectively. Consequently, 
in Figure 10 we define the substitution transition name as Commitment and the substitution 
guard is determined to be [commit]. 

The superpage and subpage interface is provided using the hierarchy inscription. The
hierarchy inscription is indicated using the dashed box next to the substitution
transition. The first line in the hierarchy inscription determines the subpage identity, i.e.,
the Commitment Interaction Protocol in our example. Moreover, it indicates that the
substitution transition replaces the corresponding subpage detailed implementation on the
superpage. The remaining hierarchy inscription lines introduce the superpage and subpage port
assignment. The port assignment relates a socket node on the superpage with a port node
on the subpage. The substitution transition input socket nodes are related to the IN-tagged
port nodes. Analogously, the substitution transition output socket nodes correspond to the
OUT-tagged port nodes. Therefore, the port assignment in Figure 10 assigns the net socket
and port nodes in the following fashion: B11S11 to B1S1, Request-Good to Request-Good
and B13S13 to B3S3.

Finally, the page hierarchy graph describes the decomposition hierarchy (nesting) of 
the different protocols (pages). The CP-net pages, the Main Interaction Protocol and 
the Commitment Interaction Protocol, correspond to the page hierarchy graph nodes 
(Figure 10). The arc inscription indicates the substitution transition, i.e. Commitment. 
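
The bookkeeping involved in a substitution transition can be summarized with plain Python
dictionaries. This is a sketch only: the dictionary layout and the check_ports helper are
assumptions made here, while the socket and port names (B11S11, B13S13, B1S1, B3S3,
Request-Good) follow the description of Figure 10.

```python
# Subpage: the nested Commitment Interaction Protocol with its tagged port nodes.
subpage = {
    "name": "Commitment Interaction Protocol",
    "ports": {"B1S1": "IN", "Request-Good": "IN", "B3S3": "OUT"},
}

# Substitution transition on the superpage, together with its hierarchy inscription.
substitution = {
    "name": "Commitment",
    "guard": "commit",
    "subpage": subpage["name"],
    # Port assignment: socket node on the superpage -> port node on the subpage.
    "port_assignment": {"B11S11": "B1S1",
                        "Request-Good": "Request-Good",
                        "B13S13": "B3S3"},
    "input_sockets": ["B11S11", "Request-Good"],
    "output_sockets": ["B13S13"],
}

def check_ports(substitution, subpage):
    """Input sockets must map to IN-tagged ports, output sockets to OUT-tagged ports."""
    assign, ports = substitution["port_assignment"], subpage["ports"]
    ok_in = all(ports.get(assign[s]) == "IN" for s in substitution["input_sockets"])
    ok_out = all(ports.get(assign[s]) == "OUT" for s in substitution["output_sockets"])
    return ok_in and ok_out

# Page hierarchy graph: superpage -> {substitution transition: subpage}.
page_hierarchy = {"Main Interaction Protocol": {"Commitment": subpage["name"]}}
valid = check_ports(substitution, subpage)
```

A consistency check of this kind is one simple way to catch a socket that has been wired to
a port of the wrong direction.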

6. Representing Temporal Aspects of Interactions 

Two temporal interaction aspects are specified by FIPA (FIPA Specifications, 2003c). In 
this section, we show how timed CP-nets (see also Appendix A) can be applied for modelling 
agent interactions that involve temporal aspects, such as interaction duration, deadlines for 
message exchange, etc. 

A first aspect, duration, is the interaction activity time period. Two periods can be
distinguished: transmission time and response time. The transmission time indicates the
time interval during which a communicative act is sent by one agent and received by the
designated receiver agent. The response time period denotes the time interval in which
the corresponding receiver agent is performing some task as a response to the incoming
communicative act.

The second temporal aspect is deadlines. Deadlines denote the time limit by which
a communicative act must be sent. Otherwise, the corresponding communicative act is
considered to be invalid. These issues have not been addressed in previous investigations
related to modelling agent interactions using Petri nets (see footnote 6).

We propose to utilize timed CP-net techniques to represent these temporal aspects of
agent interactions. In doing so, we assume a global clock (see footnote 7). We begin with
deadlines. Figure 11-a introduces the AUML representation of message deadlines. The deadline
keyword is a variation of the communicative act sequence expressions described in Section 4.
It sets a time constraint on the start of the transmission of the associated communicative act.
In Figure 11-a, agent1 must send the msg communicative act to agent2 before the defined
deadline. Once the deadline expires, the msg communicative act is considered to be invalid.

6. Cost et al. (1999, 2000) mention deadlines without presenting any implementation details.

7. Implementing it, we can use the private clock of an overhearing agent as the global clock for our Petri
net representation. Thus, the time stamp of the message is the overhearer's time when the corresponding
message was overheard.

Figure 11: Deadline sequence expression.

Figure 11-b shows a timed CP-net implementation of the deadline sequence expression.
The timed CP-net in Figure 11-b defines an additional MSG-TIME color set associated with
the net message places. The MSG-TIME color set extends the MSG color set, described in
Section 4, by adding a time stamp attribute to the message token. Thus, the communicative
act token is a record (s,r,t,c)@[Tts]. The @[..] expression denotes the corresponding token
time stamp, whereas the token time value is indicated starting with a capital 'T'.
Accordingly, the described message token has a ts time stamp. The communicative act time limit
is defined using the val deadline parameter. Therefore, the deadline sequence expression
semantics is simulated using the following transition guard: [Tts < Tdeadline]. This
transition guard, comparing the msg time stamp against the deadline parameter, guarantees
that an expired msg communicative act cannot be received.
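
In Python terms, the deadline guard amounts to a comparison on the token's time stamp. The
TimedMsg type, the can_receive function and the numeric values below are assumptions chosen
for this sketch; only the record fields and the comparison ts < deadline come from the text.

```python
from typing import NamedTuple

class TimedMsg(NamedTuple):
    s: str      # sender
    r: str      # receiver
    t: str      # communicative act type
    c: str      # content
    ts: float   # time stamp, i.e. the value carried after '@'

DEADLINE = 10.0  # plays the role of the val deadline parameter; the value is arbitrary

def can_receive(msg: TimedMsg, deadline: float = DEADLINE) -> bool:
    """Transition guard [ts < deadline]: an expired message does not enable the transition."""
    return msg.ts < deadline

on_time = TimedMsg("agent1", "agent2", "msg", "", ts=4.0)
late = TimedMsg("agent1", "agent2", "msg", "", ts=12.5)
```

Here can_receive(on_time) holds while can_receive(late) does not, so only the first token
could be consumed by the receiving transition.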

We now turn to representing interaction duration. The AUML representation is shown in
Figure 12-a. The AUML time intensive message notation is used to denote the communicative
act transmission time. As a rule, communicative act arrows are drawn horizontally.
This indicates that the message transmission time can be neglected. However, in case the
message transmission time is significant, the communicative act is drawn slanted downwards.
The vertical distance between the arrowhead and the arrow tail denotes the message
transmission time. Thus, the communicative act msg1, sent from agent1 to agent2, has a t1
transmission time.

[Figure 12 panels: (a) AUML representation; (b) CP-net representation.]

Figure 12: Interaction duration.

The response time in Figure 12-a is indicated through the interaction thread length.
The incoming msg1 communicative act causes agent2 to perform some task before sending
a response msg2 message. The corresponding interaction thread duration is denoted through
the t2 time period. Thus, this time period specifies the agent2 response time to the incoming
msg1 communicative act.

The CP-net implementation of the interaction duration time periods is shown in
Figure 12-b. The communicative act transmission time is represented using the timed CP-net
@+ operator. The net transitions simulate the communicative act transmission between
agents. Therefore, to represent a transmission time of t1, the CP-net transition adds a t1
time period to the incoming message token time stamp. Accordingly, the transition @+ Tt1
output arc expression denotes a t1 delay to the time stamp of the outgoing token. Thus,
the corresponding transition takes t1 time units and consequently so does the msg1
communicative act transmission.

In contrast to communicative act transmission time, the agent interaction response time
is represented implicitly. Previously, we have defined a MSG-TIME color set that indicates
message token time stamps. Analogously, in Figure 12-b we introduce an additional
INTER-STATE-TIME color set. This color set is associated with the net agent places and makes
it possible to attach time stamps to agent tokens as well. Now, let us assume that the A2B2
and msg2 places contain a single token each. The circled '1' next to the corresponding place,
together with the multi-set inscription, indicates the place's current marking. Thus, the agent
and the message place tokens have time stamps ts1 and ts2, respectively. The ts1 time
stamp denotes the time by which agent2 has received the msg1 communicative act sent
by agent1. The ts2 time stamp indicates the time by which agent2 is ready to send the msg2
response message to agent1. Thus, the agent2 response time t2 (Figure 12-a) is ts2 - ts1.
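
A small worked example, written in Python with arbitrary numbers, shows how the two
durations relate to the token time stamps. The transmit helper and all numeric values are
assumptions for this illustration.

```python
def transmit(msg_ts: float, delay: float) -> float:
    """Output arc with '@+ delay': the outgoing token is stamped delay units later."""
    return msg_ts + delay

t1 = 3.0                      # transmission time of msg1 (arbitrary value)
sent_at = 5.0                 # time stamp of the msg1 token when it is sent
ts1 = transmit(sent_at, t1)   # agent token in A2B2: agent2 has received msg1
ts2 = 12.0                    # msg2 token: agent2 is ready to reply (arbitrary value)
t2 = ts2 - ts1                # response time of agent2
```

The transmission time is thus added explicitly by the transition, while the response time
is only recovered afterwards as the difference between two time stamps.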

7. Algorithm and a Concluding Example 

Our final contribution in this paper is a skeleton procedure for transforming an AUML
conversation protocol diagram of two interacting agents to its CP-net representation. The
procedure is semi-automated: it relies on a human to fill in some details, but it also has
automated aspects. We apply this procedure to a complex multi-agent conversation protocol
that involves many of the interaction building blocks already discussed.

The procedure is shown in Algorithm 1. The algorithm input is an AUML protocol 
diagram and the algorithm creates, as an output, a corresponding CP-net representation. 
The CP-net is constructed in iterations using a queue. The algorithm essentially creates the 
conversation net by exploring the interaction protocol breadth-first while avoiding cycles. 

Lines 1-2 create and initiate the algorithm queue, and the output CP-net, respectively. 
The queue, denoted by S, holds the initiating agent places of the current iteration. These 
places correspond to interaction states that initiate further conversation between the in- 
teracting agents. In lines 4-5, an initial agent place A\B\ is created and inserted into the 
queue. The A\B\ place represents a joint initial interaction state for the two agents. Lines 
7-23 contain the main loop. 

Algorithm 1 Create Conversation Net (input: AUML, output: CPN)

 1: S ← new queue
 2: CPN ← new CP-net
 3:
 4: A1B1 ← new agent place with color information
 5: S.enqueue(A1B1)
 6:
 7: while S not empty do
 8:     curr ← S.dequeue()
 9:
10:     MP ← CreateMessagePlaces(AUML, curr)
11:     RP ← CreateResultingAgentPlaces(AUML, curr, MP)
12:     (TR, AR) ← CreateTransitionsAndArcs(AUML, curr, MP, RP)
13:     FixColor(AUML, CPN, MP, RP, TR, AR)
14:
15:     for each place p in RP do
16:         if p was not created in current iteration then
17:             continue
18:         if p is not terminating place then
19:             S.enqueue(p)
20:
21:     CPN.places = CPN.places ∪ MP ∪ RP
22:     CPN.transitions = CPN.transitions ∪ TR
23:     CPN.arcs = CPN.arcs ∪ AR
24:
25: return CPN

We enter the main loop in line 8 and set the curr variable to the first initiating agent
place in the S queue. Lines 10-13 create the CP-net components corresponding to the current
iteration as follows. First, in line 10, message places associated with the curr agent place
are created using the CreateMessagePlaces procedure (which we do not detail here). This
procedure extracts, from the AUML diagram, the communicative acts that are associated with
a given interaction state. These places correspond to communicative acts that take the
agents from the joint interaction state curr to its successor(s). Then, in line 11, the
CreateResultingAgentPlaces procedure creates agent places that correspond to interaction
state changes resulting from the communicative acts associated with the curr agent place
(again based on the AUML diagram). Then, in the CreateTransitionsAndArcs procedure (line 12),
these places are connected using the principles described in Sections 3-6. Thus, the CP-net
structure (net places, transitions and arcs) is created. Finally, in line 13, the FixColor
procedure adds token color elements to the CP-net structure, to support deadlines, cardinality,
and other communicative act attributes.

Lines 15-19 determine which resulting agent places are inserted into the S queue for 
further iteration. Only non-terminating agent places, i.e. places that do not correspond to 
interaction states that terminate the interaction, are inserted into the queue in lines 18-19. 
However, there is one exception (lines 16-17): a resulting agent place that has already been 
handled by the algorithm is not inserted back into the S queue, since inserting it could cause 
an infinite loop. Thereafter, completing the current iteration, the output CP-net, denoted 
by the CPN variable, is updated with the CP-net components of the current iteration in lines 
21-23. The main loop iterates as long as the S queue is not empty. The resulting CP-net is 
returned in line 25. 

Figure 13: FIPA Contract Net Interaction Protocol using AUML. 

To demonstrate this algorithm, we will now use it on the FIPA Contract Net Interaction 
Protocol (FIPA Specifications, 2003b) (Figure 13). This protocol allows interacting agents to 
negotiate. The Initiator agent issues m calls for proposals using a cfp communicative act. 
Each of the m Participants may refuse or counter-propose by a given deadline, sending either 
a refuse or a propose message, respectively. A refuse message terminates the interaction. 
In contrast, a propose message continues the corresponding interaction. 

Once the deadline expires, the Initiator does not accept any further Participant 
response messages. It evaluates the received Participant proposals and selects one, several, 
or no agents to perform the requested task. Accepted proposals result in the sending of 
accept-proposal messages, while the remaining proposals are rejected using reject-proposal 
messages. A reject-proposal terminates the interaction with the corresponding Participant. 
On the other hand, the accept-proposal message commits a Participant to perform the 
requested task. On successful completion, the Participant informs the Initiator by sending 
either an inform-done or an inform-result communicative act. However, in case a Participant 
has failed to accomplish the task, it communicates a failure message. 

We now use the algorithm introduced above to create a CP-net, which represents the 
FIPA Contract Net Interaction Protocol. The corresponding CP-net model is constructed in 
four iterations of the algorithm. Figure 14 shows the CP-net representation after the second 
iteration of the algorithm, while Figure 15 shows the CP-net representation after the fourth 
and final iteration. 

The Contract Net Interaction Protocol starts from the I1P1 place, which represents a joint 
interaction state where Initiator is ready to send a cfp communicative act (I1) and Participant 
is waiting for the corresponding cfp message (P1). The I1P1 place is created and inserted 
into the queue before the iterations through the main loop begin. 

First iteration. The curr variable is set to the I1P1 place. The algorithm creates 
net places, which are associated with the I1P1 place, i.e. a Cfp message place, and an 
I2P2 resulting agent place. The I2P2 place denotes an interaction state in which Initiator 
has already sent a cfp communicative act to Participant and is now waiting for its 
response (I2) and Participant has received the cfp message and is now deciding on an 
appropriate response (P2). These are created using the CreateMessagePlaces and the 
CreateResultingAgentPlaces procedures, respectively. 

Then, the CreateTransitionsAndArcs procedure in line 12 connects the three places 
using a simple asynchronous message building block as shown in Figure 1-b (Section 3). 
In line 13, as the color sets of the places are determined, the algorithm also handles the 
cardinality of the cfp communicative act, by putting an appropriate sequence expression on 
the transition, using the principles presented in Figure 6-b (Section 4). Accordingly, the 
color set associated with the I1P1 place is changed to the INTER-STATE-CARD color set. 
Since the I2P2 place is not a terminating place, it is inserted into the S queue. 

Second iteration. curr is set to the I2P2 place. The Participant agent can send, as a 
response, either a refuse or a propose communicative act. Refuse and Propose message 
places are created by CreateMessagePlaces (line 10), and resulting places I3P3 and I4P4, 
corresponding to the results of the refuse and propose communicative acts, respectively, 
are created by CreateResultingAgentPlaces (line 11). The I3P3 place represents a joint 
interaction state where Participant has sent the refuse message and terminated (P3), while 
Initiator has received it, and terminated (I3). The I4P4 place represents the joint state in 
which Participant has sent the propose message (P4), while Initiator has received the 
message and is considering its response (I4). 

In line 12, the I2P2, Refuse, I3P3, Propose and I4P4 places are connected using the 
XOR-decision building block presented in Figure 3-b (Section 3). Then, the FixColor 
procedure (line 13) adds the appropriate token color attributes, to allow a deadline sequence 
expression (on both the refuse and the propose messages) to be implemented as shown in 
Figure 11-b (Section 6). The I3P3 place denotes a terminating state, whereas the I4P4 
place continues the interaction. Thus, in lines 18-19, only the I4P4 place is inserted into the 
queue, for the next iteration of the algorithm. The state of the net at the end of the second 
iteration of the algorithm is presented in Figure 14. 
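
The exclusive choice made in the second iteration can be illustrated with a minimal Python sketch. It is not the net of Figure 14: the token encodings (agent tokens as pairs, message tokens as triples), the transition names on_refuse and on_propose, and the helper functions are assumptions made for illustration, and deadlines and cardinality are left out. The sketch only shows that a single agent token in I2P2 can be consumed by one of the two message-triggered transitions, never both.

```python
from collections import Counter

# Marking: place -> multiset of tokens. Agent tokens are (a1, a2) pairs and
# message tokens are (sender, receiver, type) triples (hypothetical encoding).
marking = {
    "I2P2": Counter({("initiator", "participant"): 1}),
    "Refuse": Counter(),
    "Propose": Counter({("participant", "initiator", "propose"): 1}),
    "I3P3": Counter(),
    "I4P4": Counter(),
}

# transition -> (input agent place, input message place, output agent place)
transitions = {
    "on_refuse": ("I2P2", "Refuse", "I3P3"),
    "on_propose": ("I2P2", "Propose", "I4P4"),
}

def bindings(name):
    """Pairs (agent token, message token) for which the transition can fire."""
    agent_place, msg_place, _ = transitions[name]
    return [
        (agents, msg)
        for agents in marking[agent_place].elements()
        for msg in marking[msg_place].elements()
        if (msg[0], msg[1]) == (agents[1], agents[0])  # sent by a2 to a1
    ]

def fire(name):
    """Consume one agent token and one message token, mark the successor place."""
    agent_place, msg_place, out_place = transitions[name]
    agents, msg = bindings(name)[0]
    marking[agent_place][agents] -= 1
    marking[msg_place][msg] -= 1
    marking[out_place][agents] += 1

enabled_before = {name: bool(bindings(name)) for name in transitions}
fire("on_propose")
enabled_after = {name: bool(bindings(name)) for name in transitions}
```

Once on_propose has fired, the I2P2 token is gone, so a refuse message arriving later for the same pair of agents could no longer be matched.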

color AGENT = 
color TYPE = cfp|refuse|...; 
color CONTENT = 
color INTER-STATE = record a1:AGENT * a2:AGENT; 
color CARD = int; 
color INTER-STATE-CARD = product INTER-STATE * CARD; 
color MSG = record s:AGENT * r:AGENT * t:TYPE * c:CONTENT; 
color TARGET-LIST = AGENT with ...; 
val m = ...; 
color TARGET = index TARGET-LIST with 0...m-1; 
var s, r:AGENT; var msg:MSG; 
var i:CARD; 
val deadline = ...; 

Figure 14: FIPA Contract Net Interaction Protocol using CP-net after the 2nd iteration. 

Third iteration. curr is set to I4P4. Here, the Initiator response to a Participant 
proposal can either be an accept-proposal or a reject-proposal. The CreateMessagePlaces 
procedure in line 10 thus creates the corresponding Accept-Proposal and Reject-Proposal message 
places. The reject-proposal and accept-proposal messages cause the interacting agents to 
transition to the I5P5 and I6P6 places, respectively. These agent places are created using the 
CreateResultingAgentPlaces procedure (line 11). The I5P5 place denotes an interaction 
state in which Initiator has sent a reject-proposal message and terminated the interaction 
(I5), while the Participant has received the message and terminated as well (P5). In 
contrast, the I6P6 place represents an interaction state where Initiator has sent an 
accept-proposal message and is waiting for a response (I6), while Participant has received the 
accept-proposal communicative act and is now performing the requested task before sending 
a response (P6). The Initiator agent sends exclusively either an accept-proposal or a 
reject-proposal message. Thus, the I4P4, Reject-Proposal, I5P5, Accept-Proposal and I6P6 places 
are connected using a XOR-decision block (in the CreateTransitionsAndArcs procedure, 
line 12). 

The FixColor procedure in line 13 now operates as follows. According to the interaction 
protocol semantics, the Initiator agent evaluates all the received Participant proposals once 
the deadline passes. Only thereafter are the appropriate reject-proposal and accept-proposal 
communicative acts sent. Thus, FixColor assigns a MSG-TIME color set to the 
Reject-Proposal and the Accept-Proposal message places, and creates a [Tts >= Tdeadline] 
transition guard on the associated transitions. This transition guard guarantees that Initiator 
cannot send any response until the deadline expires, and all valid Participant responses 
have been received. The resulting I5P5 agent place denotes a terminating interaction state, 
whereas the I6P6 agent place continues the interaction. Thus, only the I6P6 agent place is 
inserted into the S queue. 

Fourth iteration. curr is set to I6P6. This place is associated with three 
communicative acts: inform-done, inform-result and failure. The inform-done and the 
inform-result messages are instances of the inform communicative act class. Thus, 
CreateMessagePlaces (line 10) creates only two message places, Inform and Failure. In line 11, 
CreateResultingAgentPlaces creates the I7P7 and I8P8 agent places. The failure 
communicative act causes interacting agents to transition to the I7P7 agent place, while both inform 
messages cause the agents to transition to the I8P8 agent place. The I7P7 place represents a 
joint interaction state where Participant has sent the failure message and terminated (P7), 
while Initiator has received a failure communicative act and terminated (I7). On the other 
hand, the I8P8 place denotes an interaction state in which Participant has sent the inform 
message (either inform-done or inform-result) and terminated (P8), while Initiator has 
received an inform communicative act and terminated (I8). The inform and failure 
communicative acts are sent exclusively. Thus CreateTransitionsAndArcs (line 12) connects 
the I6P6, Failure, I7P7, Inform and I8P8 places using a XOR-decision building block. 
Then, FixColor assigns a [#t msg = inform-done or #t msg = inform-result] transition 
guard on the transition associated with the Inform message place. Since both the I7P7 and 
the I8P8 agent places represent terminating interaction states, they are not inserted into the 
queue, which remains empty at the end of the current iteration. This signifies the end of the 
conversion. The complete conversation CP-net resulting after this iteration of the algorithm 
is shown in Figure 15. 

Figure 15: FIPA Contract Net Interaction Protocol using CP-net after the 4th (and final) 
iteration. 
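
The four iterations traced above can be reproduced with a minimal Python sketch of the main loop of Algorithm 1. Several simplifications are assumptions made for illustration and are not part of the procedure itself: the AUML diagram is reduced to a hypothetical successor table (contract_net), message places are named after their communicative acts, the three Create... procedures are collapsed into the loop body, transition names are invented, and FixColor (colors, cardinality, deadlines) is omitted entirely.

```python
from collections import deque

def create_conversation_net(auml, initial, terminating):
    """Breadth-first construction in the spirit of Algorithm 1.

    auml maps a joint interaction state to a list of
    (communicative act, successor state) pairs; this table is a
    hypothetical stand-in for a parsed AUML diagram.
    """
    queue = deque([initial])                              # lines 1, 4-5
    places, transitions, arcs = {initial}, set(), set()   # line 2
    iterations = 0

    while queue:                                          # line 7
        curr = queue.popleft()                            # line 8
        iterations += 1
        for act, succ in auml.get(curr, []):
            created_now = succ not in places              # lines 16-17
            trans = f"{curr}--{act}"                      # hypothetical name
            places.update({act, succ})                    # lines 10-11, 21
            transitions.add(trans)                        # lines 12, 22
            arcs.update({(curr, trans), (act, trans), (trans, succ)})  # line 23
            if created_now and succ not in terminating:   # lines 18-19
                queue.append(succ)

    return {"places": places, "transitions": transitions, "arcs": arcs}, iterations

# Successor table for the Contract Net walk-through (two agent roles).
contract_net = {
    "I1P1": [("cfp", "I2P2")],
    "I2P2": [("refuse", "I3P3"), ("propose", "I4P4")],
    "I4P4": [("reject-proposal", "I5P5"), ("accept-proposal", "I6P6")],
    "I6P6": [("failure", "I7P7"), ("inform", "I8P8")],
}
terminating = {"I3P3", "I5P5", "I7P7", "I8P8"}

net, iterations = create_conversation_net(contract_net, "I1P1", terminating)
```

The created_now check plays the role of lines 16-17: a place that already exists is never enqueued again, so a protocol whose successor table contains a cycle still terminates.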



The procedure we outline can guide the conversion of many 2-agent conversation 
protocols in AUML to their CP-net equivalents. However, it is not sufficiently developed to 
address the general n-agent case. Appendix C presents a complex example of a 3-agent 
conversation protocol, which was successfully converted manually, without the guidance of the 
algorithm. This example incorporates many advanced features of our CP-net representation 
technique and would have been beyond the scope of many previous investigations. 

8. Summary and Conclusions 

Over recent years, open distributed MAS applications have gained broad acceptance both 
in the multi-agent academic community and in real-world industry. As a result, increasing 
attention has been directed to multi-agent conversation representation techniques. In 
particular, Petri nets have recently been shown to provide a viable representation approach 
(Cost et al., 1999, 2000; Nowostawski et al., 2001; Mazouzi et al., 2002). 

However, radically different approaches have been proposed to using Petri nets for 
modelling multi-agent conversations. Yet, the relative strengths and weaknesses of the proposed 
techniques have not been examined. Our work introduces a novel classification of previous 
investigations and then compares these investigations with respect to their scalability and 
appropriateness for overhearing tasks. 

Based on the insights gained from the analysis, we have developed a novel representation 
that uses CP-nets in which places explicitly represent joint interaction states and messages. 
This representation technique offers significant improvements (compared to previous 
approaches) in terms of scalability, and is particularly suitable for monitoring via overhearing. 
We systematically show how this representation covers essentially all the features required 
to model complex multi-agent conversations, as defined by the FIPA conversation 
standards (FIPA Specifications, 2003c). These include simple and complex interaction building 
blocks (Section 3 and Appendix B), communicative act attributes and multiple concurrent 
conversations using the same CP-net (Section 4), nested and interleaved interactions using 
hierarchical CP-nets (Section 5) and temporal interaction attributes using timed CP-nets 
(Section 6). The developed techniques have been demonstrated, throughout the paper, on 
complex interaction protocols defined in the FIPA conversation standards (see in particular 
the example presented in Appendix C). Previous approaches could handle some of these 
examples (though with reduced scalability), but only a few were shown to cover all the 
required features. 

Finally, the paper presented a skeleton procedure for semi-automatically converting 
AUML protocol diagrams (the chosen FIPA representation standard) to an equivalent 
CP-net representation. We have demonstrated its use on a challenging FIPA conversation 
protocol, which was difficult to represent using previous approaches. 

We believe that this work can assist and motivate continuing research on multi-agent 
conversations including such issues as performance analysis, validation and verification (De- 
sel et al., 1997), agent conversation visualization, automated monitoring (Kaminka et al., 
2002; Busetta et al., 2001, 2002), deadlock detection (Khomenco & Koutny, 2000), debug- 
ging (Poutakidis et al., 2002) and dynamic interpretation of interaction protocols (Cranefield 
et al., 2002; de Silva et al., 2003). Naturally, some issues remain open for future work. For 
example, the presented procedure addresses only AUML protocol diagrams representing two 
agent roles. We plan to investigate an n-agent version in the future. 

Appendix A. A Brief Introduction to Petri Nets 

Petri nets (Petri Nets site, 2003) are a widespread, established methodology for representing 
and reasoning about distributed systems, combining a graphical representation with a 
comprehensive mathematical theory. One version of Petri nets is called Place/Transition nets 
(PT-nets) (Reisig, 1985). A PT-net is a bipartite directed graph where each node is either 
a place or a transition (Figure 16). The net places and transitions are indicated through 
circles and rectangles respectively. The PT-net arcs support only place → transition and 
transition → place connections, but never connections between two places or between two 
transitions. The arc direction determines the input/output characteristics of the place and 
the transition connected. Thus, given an arc P → T, connecting place P and transition T, 
we say that place P is an input place of transition T and, vice versa, that transition T is an 
output transition of place P. The P → T arc is considered to be an output arc of place P 
and an input arc of transition T. 


Figure 16: A PT-net example. (a) Before firing; (b) After firing. 

A PT-net place may be marked by small black dots called tokens. The arc expression is 
an integer, which determines the number of tokens associated with the corresponding arc. 
By convention, an arc expression equal to 1 is omitted. A specific transition is enabled if 
and only if the marking of its input places satisfies the appropriate arc expressions. For example, 
consider arc P → T to be the only arc connecting place P and transition T. Given 
that this arc has an arc expression 2, transition T is enabled if and only 
if place P is marked with at least two tokens. In case the transition is enabled, it may fire/occur. 
The transition occurrence removes tokens from the transition input places and puts tokens 
into the transition output places as specified by the arc expressions of the corresponding 
input/output arcs. Thus, in Figures 16-a and 16-b, we demonstrate PT-net marking before 
and after transition firing respectively. 
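
The enabling and firing rule just described can be written as a short Python sketch. The encoding is an assumption made for illustration: a marking is a dictionary from place names to token counts, and a transition is given by two dictionaries mapping its input and output places to integer arc expressions. The place names P and R and the arc weights are hypothetical and are not read off Figure 16.

```python
def is_enabled(marking, inputs):
    """A transition is enabled iff every input place holds at least as many
    tokens as the arc expression on the connecting arc demands."""
    return all(marking.get(place, 0) >= weight for place, weight in inputs.items())

def fire(marking, inputs, outputs):
    """Return the marking reached by one occurrence of the transition."""
    if not is_enabled(marking, inputs):
        raise ValueError("transition is not enabled")
    after = dict(marking)
    for place, weight in inputs.items():
        after[place] = after.get(place, 0) - weight   # remove tokens from input places
    for place, weight in outputs.items():
        after[place] = after.get(place, 0) + weight   # add tokens to output places
    return after

# Hypothetical transition T: arc P -> T has arc expression 2, arc T -> R has 1.
inputs = {"P": 2}
outputs = {"R": 1}

before = {"P": 2, "R": 0}
after = fire(before, inputs, outputs)
```

With only one token in P the same transition is not enabled, which matches the example with arc expression 2 in the text.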

Although computationally equivalent, a different version of Petri nets, called Colored 
Petri nets (CP-nets) (Jensen, 1997a, 1997b, 1997c), offers greater flexibility in compactly 
representing complex systems. Similarly to the PT-net model, CP-nets consist of net places, 
net transitions and arcs connecting them. However, in CP-nets, tokens are not just single 
bits, but can be complex, structured information carriers. The type of additional 
information carried by the token is called the token color, and it may be simple (e.g., an integer or a 
string), or complex (e.g., a record or a tuple). Each place is declared with a place color set, so that 
it holds only tokens of particular colors. A CP-net place marking is a token multi-set (i.e., a 
set in which a member may appear more than once) corresponding to the appropriate place 
color set. CP-net arcs pass token multi-sets between the places and transitions. CP-net arc 
expressions can evaluate token multi-sets and may involve complex calculation procedures 
over token variables declared to be associated with the corresponding arcs. 

The CP-net model introduces additional extensions to PT-nets. Transition guards are 
boolean expressions, which constrain transition firings. A transition guard associated with 
a transition tests tokens that pass through a transition, and will only enable the transition 
firings if the guard is successfully matched (i.e., the test evaluates to true). The CP-net 
transition guards, together with places color sets and arc expressions, appear as a part of 
net inscriptions in the CP-net. 

In order to visualize and manage the complexity of large CP-nets, hierarchical CP-nets 
(Huber, Jensen, & Shapiro, 1991; Jensen, 1997a) allow hierarchical representations of CP- 
nets, in which sub-CP nets can be re-used in higher-level CP nets, or abstracted away from 
them. Hierarchical CP-nets are built from pages, which are themselves CP-nets. Superpages 
present a higher level of hierarchy, and are CP-nets that refer to subpages, in addition to 
transitions and places. A subpage may also function as a superpage to other subpages. This 
way, multiple hierarchy levels can be used in a hierarchical CP-net structure. 

The relationship between a superpage and a subpage is defined by a substitution transi- 
tion, which substitutes a corresponding subpage instance on the CP-net superpage structure 
as a transition in the superpage. The substitution transition hierarchy inscription supplies 
the exact mapping of the superpage places connected to the substitution transition (called 
socket nodes), to the subpage places (called port nodes). The port types determine the 
characteristics of the socket node to port node mappings. A complete CP-net hierarchical 
structure is presented using a page hierarchy graph, a directed graph where vertices corre- 
spond to pages, and directed edges correspond to direct superpage-subpage relationships. 

Timed CP-nets (Jensen, 1997b) extend CP-nets to support the representation of tem- 
poral aspects using a global clock. Timed CP-net tokens have an additional color attribute 
called time stamp, which refers to the earliest time at which the token may be used. Time 
stamps can be used by arc expression and transition guards, to enable a timed-transition if 
and only if it satisfies two conditions: (i) the transition is color enabled, i.e. it satisfies the 
constraints defined by arc expression and transition guards; and (ii) the tokens are ready, 
i.e. the time of the global clock is equal to or greater than the tokens' time stamps. Only 
then can the transition fire. 
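
The two enabling conditions of a timed transition can be sketched in Python as follows. This is a simplified illustration rather than Jensen's full definition: tokens are reduced to a color string plus a time stamp, the guard is an ordinary Python predicate, and the names TimedToken, is_propose and the clock values are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TimedToken:
    color: str        # token color (kept to a plain string here)
    time_stamp: int   # earliest model time at which the token may be used

def color_enabled(tokens, guard):
    """Condition (i): the bound tokens satisfy the transition guard."""
    return all(guard(token) for token in tokens)

def tokens_ready(tokens, clock):
    """Condition (ii): the global clock has reached every time stamp."""
    return all(clock >= token.time_stamp for token in tokens)

def timed_enabled(tokens, guard, clock):
    """A timed transition may fire only if both conditions hold."""
    return color_enabled(tokens, guard) and tokens_ready(tokens, clock)

def is_propose(token):
    """Hypothetical guard: accept only propose messages."""
    return token.color == "propose"

tokens = [TimedToken("propose", time_stamp=5)]
too_early = timed_enabled(tokens, is_propose, clock=4)
on_time = timed_enabled(tokens, is_propose, clock=5)
wrong_color = timed_enabled([TimedToken("refuse", time_stamp=5)], is_propose, clock=9)
```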

Appendix B. Additional Examples of Conversation Representation 
Building Blocks 

This appendix presents some additional interaction building blocks to those already 
described in Section 3. The first is the AND-parallel messages interaction (AUML 
representation shown in Figure 17-a). Here, the sender agent1 sends both the msg1 message to 
agent2 and the msg2 message to agent3. However, the order of the two communicative acts 
is unconstrained. 

Figure 17: AND-parallel messages interaction. (a) AUML representation; (b) CP-net representation. 

The representation of AND-parallel in our CP-net representation is shown in Figure 17-b. 
The A1B1C1, A2B2, A2C2, msg1 and msg2 places are defined similarly to Figures 3-b and 
4-b in Section 3. However, we also define two additional intermediate agent places, A1'B1C1 
and A1''B1C1. The A1'B1C1 place represents a joint interaction state where agent1 has sent 
the msg1 message to agent2 and is ready to send the msg2 communicative act to agent3 
(A1'), agent2 has received the msg1 message (B1) and agent3 is waiting to receive the msg2 
communicative act (C1). The A1''B1C1 place represents a joint interaction state in which 
agent1 is ready to send the msg1 message to agent2 and has already sent the msg2 
communicative act to agent3 (A1''), agent2 is waiting to receive the msg1 message (B1) and agent3 
has received the msg2 communicative act (C1). These places enable agent1 to send both 
communicative acts concurrently. Four transitions connect the appropriate places. 
The behavior of the transitions connecting A1'B1C1 → A2B2 and A1''B1C1 → A2C2 
is similar to that described above. The transitions A1B1C1 → A1'B1C1 and A1B1C1 → A1''B1C1 
are triggered by receiving messages msg1 and msg2, respectively. However, these 
transitions should not consume the message token, since it is used further for triggering transitions 
A1'B1C1 → A2B2 and A1''B1C1 → A2C2. This is achieved by adding an appropriate message 
place as an output place of the corresponding transition. 

The second AUML interaction building block, shown in Figure 18-a, is the message 
sequence interaction, which is similar to AND-parallel. However, the message sequence 
interaction defines explicitly the order between the transmitted messages. Using the 1/msg1 
and 2/msg2 notation, Figure 18-a specifies that the msg1 message should be sent before 
sending msg2. 

Figure 18: Sequence messages interaction. (a) AUML representation; (b) CP-net representation. 

Figure 18-b shows the corresponding CP-net representation. The A1B1C1, A2B2, A2C2, 
msg1 and msg2 places are defined as before. However, the CP-net implementation presents 
an additional intermediate agent place, A1'B1C1, which is identical to the corresponding 
intermediate agent place in Figure 17-b. A1'B1C1 is defined as an output place of the 
A1B1C1 → A2B2 transition. It thus guarantees that the msg2 communicative act can be 
sent (represented by the A1'B1C1 → A2C2 transition) only upon completion of the msg1 
transmission (the A1B1C1 → A2B2 transition). 
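
The difference between the AND-parallel and the sequence building blocks can be stated abstractly as the set of message orders each one permits. The Python sketch below works at that abstract level only; it does not model the places and transitions of Figures 17-b and 18-b, and the function legal_orders and its before parameter are assumptions introduced for illustration.

```python
from itertools import permutations

def legal_orders(messages, before=()):
    """All orderings of the messages that respect every (earlier, later) pair."""
    return [
        order
        for order in permutations(messages)
        if all(order.index(a) < order.index(b) for a, b in before)
    ]

# AND-parallel: the order of the two communicative acts is unconstrained.
and_parallel = legal_orders(["msg1", "msg2"])

# Sequence (1/msg1, 2/msg2): msg1 must be sent before msg2.
sequence = legal_orders(["msg1", "msg2"], before=[("msg1", "msg2")])
```

An overhearing agent that observes msg2 before msg1 can therefore rule out the sequence interaction while still accepting AND-parallel.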

The last interaction we present is the synchronized messages interaction, shown in 
Figure 19-a. Here, agent3 simultaneously receives msg1 from agent1 and msg2 from agent2. 
In AUML, this constraint is annotated by merging the two communicative act arrows into 
a horizontal bar with a single output arrow. 

Figure 19: Synchronized messages interaction. (a) AUML representation; (b) CP-net representation. 

Figure 19-b illustrates the CP-net implementation of the synchronized messages interaction. 
As in previous examples, we define the A1C1, B1C1, msg1, msg2 and A2B2C2 places. We 
additionally define two intermediate agent places, A2C1' and B2C1''. The A2C1' place 
represents a joint interaction state where agent1 has sent msg1 to agent3 (A2), and agent3 has 
received it; however, agent3 is also waiting to receive msg2 (C1'). The B2C1'' place represents 
a joint interaction state in which agent2 has sent msg2 to agent3 (B2), and agent3 has 
received it; however, agent3 is also waiting to receive msg1 (C1''). These places guarantee 
that the interaction does not transition to the A2B2C2 state until both msg1 and msg2 have 
been received by agent3. 
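
The role of the two intermediate places can be checked with a small uncolored simulation in Python. The arc structure used here is an assumption inferred from the description above, not a transcription of Figure 19-b: one transition per received message, each feeding an intermediate place, and a joining transition that needs both intermediate places. Transition names are hypothetical, all arc expressions are 1, and token colors are ignored.

```python
# transition -> (input places, output places); every arc expression is 1.
NET = {
    "recv_msg1": ({"A1C1", "msg1"}, {"A2C1'"}),
    "recv_msg2": ({"B1C1", "msg2"}, {"B2C1''"}),
    "join": ({"A2C1'", "B2C1''"}, {"A2B2C2"}),
}

def enabled(marking, name):
    """True if every input place of the transition holds at least one token."""
    inputs, _ = NET[name]
    return all(marking.get(place, 0) >= 1 for place in inputs)

def fire(marking, name):
    """Return the marking after one occurrence of the named transition."""
    if not enabled(marking, name):
        raise ValueError(f"{name} is not enabled")
    inputs, outputs = NET[name]
    after = dict(marking)
    for place in inputs:
        after[place] -= 1
    for place in outputs:
        after[place] = after.get(place, 0) + 1
    return after

start = {"A1C1": 1, "B1C1": 1, "msg1": 1, "msg2": 1}
after_first = fire(start, "recv_msg1")
join_too_early = enabled(after_first, "join")   # only one message received so far
after_both = fire(after_first, "recv_msg2")
final = fire(after_both, "join")
```

Under these assumptions the join cannot occur after a single message, and the A2B2C2 place is marked only after both have been received, in either order.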

Appendix C. An Example of a Complex Interaction Protocol 

We present an example of a complex 3-agent conversation protocol, which was manually 
converted to a CP-net representation using the building blocks in this paper. The conversation 
protocol addressed here is the FIPA Brokering Interaction Protocol (FIPA Specifications, 
2003a). This interaction protocol incorporates many advanced conversation features of our 
representation, such as nesting, communicative act sequence expressions and message guards. 
Its AUML representation is shown in Figure 20. 

The Initiator agent begins the interaction by sending a proxy message to the Broker 
agent. The proxy communicative act contains the requested proxied-communicative-act as 
part of its argument list. The Broker agent processes the request and responds with either an 
agree or a refuse message. Communication of a refuse message terminates the interaction. 
If the Broker agent has agreed to function as a proxy, it then locates the agents matching 
the Initiator request. If no such agent can be found, the Broker agent communicates 
a failure-no-match message and the interaction terminates. Otherwise, the Broker agent 
begins m interactions with the matching agents. For each such agent, the Broker informs the 
Initiator, sending either an inform-done-proxy or a failure-proxy communicative act. The 
failure-proxy communicative act terminates the sub-protocol interaction with the matching 
agent in question. The inform-done-proxy message continues the interaction. As the 
sub-protocol progresses, the Broker forwards the received responses to the Initiator agent using 
the reply-message-sub-protocol communicative acts. However, there can be other failures 
that are not explicitly returned from the sub-protocol interaction (e.g., if the agent executing 
the sub-protocol has failed). In case the Broker agent detects such a failure, it communicates 
a failure-brokering message, which terminates the sub-protocol interaction. 

A CP-net representation of the FIPA Brokering Interaction Protocol is shown in 
Figure 21. The Brokering Interaction Protocol starts from the I1B1 place. The I1B1 place 
represents a joint interaction state where Initiator is ready to send a proxy communicative 
act (I1) and Broker is waiting to receive it (B1). The proxy communicative act causes the 
interacting agents to transition to I2B2. This place denotes an interaction state in which 
Initiator has already sent a proxy message to Broker (I2) and Broker has received it (B2). 
The Broker agent can send, as a response, either a refuse or an agree communicative act. 
This CP-net component is implemented using the XOR-decision building block presented 
in Section 3. The refuse message causes the agents to transition to the I3B3 place and thus 
terminate the interaction. This place corresponds to Broker sending a refuse message 
and terminating (B3), while Initiator receives the message and terminates (I3). On the 
other hand, the agree communicative act causes the agents to transition to the I4B4 place, 
which represents a joint interaction state in which the Broker has sent an agree message 
to Initiator (and is now trying to locate the receivers of the proxied message), while the 
Initiator has received the agree message. 

Figure 20: FIPA Brokering Interaction Protocol - AUML representation. 

The Broker agent's search for suitable receivers may result in two alternatives. First, 
in case no matching agents are found, the interaction terminates in the I5B5 agent place. 
This joint interaction place corresponds to an interaction state where Broker has sent the 
failure-no-match communicative act (B5), and Initiator has received the message and 
terminated (I5). The second alternative is that suitable agents have been found. Then, Broker 
starts sending proxied-communicative-act messages to these agents on the established list 
of designated receivers, i.e. TARGET-LIST. The first such proxied-communicative-act 
message causes the interacting agents to transition to the I4B6P1 place. The I4B6P1 place denotes 
a joint interaction state of three agents: Initiator, Broker and Participant (the receiver). 
The Initiator individual state remains unchanged (I4) since the proxied-communicative-act 
message starts an interaction between Broker and Participant. The Broker individual 
state (B6) denotes that designated agents have been found and the 
proxied-communicative-act messages are ready to be sent, while Participant is waiting to receive the interaction 
initiating communicative act (P1). The proxied-communicative-act message place is also 
connected as an output place of the transition. This message place is used as part of a 
CP-net XOR-decision structure, which enables the Broker agent to send either a 
failure-no-match or a proxied-communicative-act, respectively. Thus, the token denoting the 
proxied-communicative-act message must not be consumed by the transition. 

Figure 21: FIPA Brokering Interaction Protocol - CP-net representation. 

Thus, multiple proxied-communicative-act messages are sent to all Participants. This 
is implemented similarly to the broadcast sequence expression implementation (Section 4). 
Furthermore, the proxied-communicative-act type is verified against the type of the requested 
proxied communicative act, which is obtained from the original proxy message content. 
We use the Proxied-Communicative-Act-Type message type place to implement this 
CP-net component similarly to Figure 8. Each proxied-communicative-act message causes the 
interacting agents to transition to both the I4B7P1 and the B6P1 places. 

The B6P1 place corresponds to the interaction between the Broker and the Participant 
agents. It represents a joint interaction state in which Broker is ready to send a 
proxied-communicative-act message to Participant (B6), and Participant is waiting for the message 
(P1). In fact, the B6P1 place initiates the nested interaction protocol that results in the B10P3 
place. The B10P3 place represents a joint interaction state where Participant has sent 
the reply-message communicative act and terminated (P3), and Broker has received the 
message (B10). In our example, we have chosen the FIPA Query Interaction Protocol (FIPA 
Specifications, 2003d) (Figures 7-8) as the interaction sub-protocol. The CP-net component 
implementing the nested interaction sub-protocol is modeled using the principles described 
in Section 5. Consequently, the interaction sub-protocol is concealed using the 
Query-Sub-Protocol substitution transition. The B6P1, proxied-communicative-act and B10P3 places 
determine substitution transition socket nodes. These socket nodes are assigned to the 
CP-net port nodes in Figure 8 as follows. The B6P1 and proxied-communicative-act places are 
assigned to the I1P1 and query input port nodes, while the B10P3 place is assigned to the 
I3P3, I5P5 and I6P6 output port nodes. 

We now turn to the I4B7P1 place. In contrast to the B6P1 place, this place corresponds to
the main interaction protocol. The I4B7P1 place represents a joint interaction state in which
Initiator is waiting for Broker to respond (I4), Broker is ready to send an appropriate response
communicative act (B7), and to the best of the Initiator's knowledge the interaction
with Participant has not yet begun (P1). The Broker agent can send one of two messages,
either a failure-proxy or an inform-done-proxy, depending on whether it has succeeded in
sending the proxied-communicative-act message to Participant. The failure-proxy message
causes the agents to terminate the interaction with the corresponding Participant agent and to
transition to the I6B8P1 place. This place denotes a joint interaction state in which Initiator
has received a failure-proxy communicative act and terminated (I6), Broker has sent the
failure-proxy message and terminated as well (B8), and the interaction with the Participant
agent has never started (P1). On the other hand, the inform-done-proxy causes the agents to
transition to the I7B9P2 place. The I7B9P2 place represents an interaction state where Broker
has sent the inform-done-proxy message (B9), Initiator has received it (I7), and Participant
has begun the interaction with the Broker agent (P2). Again, this is represented using the
XOR-decision building block.

Finally, the Broker agent can either send a reply-message-sub-protocol or a failure-brokering
communicative act. The failure-brokering message causes the interacting agents
to transition to the I8B11P2 place. This place indicates that Broker has sent a failure-brokering
message and terminated (B11), Initiator has received the message and terminated (I8), and
Participant has terminated during the interaction with the Broker agent (P2). The
reply-message-sub-protocol communicative act causes the agents to transition to the I9B12P3 place.
The I9B12P3 place indicates that Broker has sent a reply-message-sub-protocol message and
terminated (B12), Initiator has received the message and terminated (I9), and Participant
has successfully completed the nested sub-protocol with the Broker agent and terminated as
well (P3). Thus, the B10P3 place, denoting a successful completion of the nested sub-protocol,
is also the corresponding transition input place.

Abstract

Open distributed multi-agent systems are gaining interest in the academic community
and in industry. In such open settings, agents are often coordinated using standardized
agent conversation protocols. The representation of such protocols (for analysis, validation,
monitoring, etc.) is an important aspect of multi-agent applications. Recently, Petri
nets have been shown to be an interesting approach to such representation, and radically
different approaches using Petri nets have been proposed. However, their relative strengths
and weaknesses have not been examined. Moreover, their scalability and suitability for
different tasks have not been addressed. This paper addresses both these challenges. First,
we analyze existing Petri net representations in terms of their scalability and appropriateness
for overhearing, an important task in monitoring open multi-agent systems. Then,
building on the insights gained, we introduce a novel representation using Colored Petri
nets that explicitly represent legal joint conversation states and messages. This representation
approach offers significant improvements in scalability and is particularly suitable
for overhearing. Furthermore, we show that this new representation offers a comprehensive
coverage of all conversation features of FIPA conversation standards. We also present
a procedure for transforming AUML conversation protocol diagrams (a standard human-readable
representation) to our Colored Petri net representation.



1. Introduction 



Open distributed multi-agent systems (MAS) are composed of multiple, independently-built 
agents that carry out mutually-dependent tasks. In order to allow inter-operability of agents 
of different designs and implementation, the agents often coordinate using standardized in- 
teraction protocols, or conversations. Indeed, the multi-agent community has been investing 
a significant effort in developing standardized Agent Communication Languages (ACL) to 
facilitate sophisticated multi-agent systems (?, ?, ?, ?). Such standards define communica- 
tive acts, and on top of them, interaction protocols, ranging from simple queries as to the 
state of another agent, to complex negotiations by auctions or bidding on contracts. For 
instance, the FIPA Contract Net Interaction Protocol (?) defines a concrete set of message 
sequences that allows the interacting agents to use the contract net protocol for negotiations. 

Various formalisms have been proposed to describe such standards (e.g., ?, ?, ?, ?, ?).
In particular, AUML (Agent Unified Modelling Language) is currently used in the FIPA-ACL
standards (?, ?, ?, ?, ?)¹. UML 2.0 (?), a new emerging standard influenced by AUML,
has the potential to become the FIPA-ACL standard (and a forthcoming IEEE standard) in
the future. However, for the moment, a large set of FIPA specifications remains formalized
using AUML. While AUML is intended for human readability and visualization, interaction
protocols should ideally be represented in a way that is amenable to automated analysis,
validation and verification, online monitoring, etc.

1. (?) is currently deprecated. However, we use this specification since it describes many important features
needed in modelling multi-agent interactions.

Lately, there has been increasing interest in using Petri nets (?) in modelling multi-agent
interaction protocols (?, ?, ?, ?, ?, ?, ?, ?, ?, ?). There is broad literature on using Petri nets
to analyze the various aspects of distributed systems (e.g. in deadlock detection as shown
by ?), and there has been recent work on specific uses of Petri nets in multi-agent systems,
e.g., in validation and testing (?), in automated debugging and monitoring (?), in dynamic
interpretation of interaction protocols (?, ?), in modelling agents' behavior induced by their
participation in a conversation (?) and in interaction protocol refinement allowing modular
construction of complex conversations (?).

However, key questions remain open on the use of Petri nets for conversation represen- 
tation. First, while radically different approaches to representation using Petri nets have 
been proposed, their relative strengths and weaknesses have not been investigated. Second, 
many investigations have only addressed restricted subsets of the features needed in
representing complex conversations such as those standardized by FIPA (see detailed discussion
of previous work in Section 2). Finally, no procedures have been proposed for translating
human-readable AUML protocol descriptions into the corresponding machine-readable Petri 
nets. 

This paper addresses these open challenges in the context of scalable overhearing. Here, 
an overhearing agent passively tracks many concurrent conversations involving multiple par- 
ticipants, based solely on their exchanged messages, while not being a participant in any 
of the overheard conversations itself (?, ?, ?, ?, ?, ?, ?, ?). Overhearing is useful in visual- 
ization and progress monitoring (?), in detecting failures in interactions (?), in maintaining 
organizational and situational awareness (?, ?, ?) and in non-obtrusively identifying oppor- 
tunities for offering assistance (?, ?). For instance, an overhearing agent may monitor the 
conversation of a contractor agent engaged in multiple contract-net protocols with different 
bidders and bid callers, in order to detect failures. 

We begin with an analysis of Petri net representations, with respect to scalability and 
overhearing. We classify representation choices along two dimensions affecting scalability: 
(i) the technique used to represent multiple concurrent conversations; and (ii) the choice 
of representing either individual or joint interaction states. We show that while the run- 
time complexity of monitoring conversations using different approaches is the same, choices 
along these two dimensions have significantly different space requirements, and thus some 
choices are more scalable (in the number of conversations) than others. We also argue that 
representations suitable for overhearing require the use of explicit message places, though 
only a subset of previously-explored techniques utilized those. 

Building on the insights gained, the paper presents a novel representation that uses 
Colored Petri nets (CP-nets) in which places explicitly denote messages, and valid joint 
conversation states. This representation is particularly suited for overhearing as the number 
of conversations is scaled-up. We show how this representation can be used to represent 
essentially all features of FIPA AUML conversation standards, including simple and com- 
plex interaction building blocks, communicative act attributes such as message guards and 
cardinalities, nesting, and temporal aspects such as deadlines and duration. 






Representing Conversations for Scalable Overhearing 



To realize the advantages of machine-readable representations, such as for debugging 
(?), existing human-readable protocol descriptions must be converted to their corresponding 
Petri net representations. As a final contribution in this paper, we provide a skeleton semi- 
automated procedure for converting FIPA conversation protocols in AUML to Petri nets, 
and demonstrate its use on a complex FIPA protocol. While this procedure is not fully 
automated, it takes a first step towards addressing this open challenge. 

This paper is organized as follows. Section 2 presents the motivation for our work.
Sections 3 through 6 then present the proposed representation addressing all FIPA conversation
features including basic interaction building blocks (Section 3), message attributes
(Section 4), nested and interleaved interactions (Section 5), and temporal aspects (Section 6).
Section 7 ties these features together: it presents a skeleton algorithm for transforming an
AUML protocol diagram to its Petri net representation, and demonstrates its use on a challenging
FIPA conversation protocol. Section 8 concludes. The paper closes with three
appendices. The first provides a quick review of Petri nets. Then, to complete coverage of
FIPA interactions, Appendix B provides additional interaction building blocks. Appendix C
presents a Petri net of a complex conversation protocol, which integrates many of the features
of the developed representation technique.

2. Representations for Scalable Overhearing 

Overhearing involves monitoring conversations as they progress, by tracking messages that 
are exchanged between participants (?). We are interested in representations that can facil- 
itate scalable overhearing, tracking many concurrent conversations, between many agents. 
We focus on open settings, where the complex internal state and control logic of agents is 
not known in advance, and therefore exclude discussions of Petri net representations which 
explicitly model agent internals (e.g., ?, ?). Instead, we treat agents as black boxes, and 
consider representations that commit only to the agent's conversation state (i.e., its role and 
progress in the conversation). 

The suitability of a representation for scalable overhearing is affected by several facets.
First, since overhearing is based on tracking messages, the representation must be able to
explicitly represent the passing of a message (communicative act) from one agent to another
(Section 2.1). Second, the representation must facilitate tracking of multiple concurrent
conversations. While the tracking runtime is bounded from below by the number of messages
(since in any case, all messages are overheard and processed), space requirements may differ
significantly (see Sections 2.2 and 2.3).

2.1 Message-monitoring versus state-monitoring 

We distinguish two settings for tracking the progress of conversations, depending on the 
information available to the tracking agent. In the first type of setting, which we refer to 
as state monitoring, the tracking agent has access to the internal state of the conversation 
in one or more of the participants, but not necessarily to the messages being exchanged. 
The other setting involves message monitoring, where the tracking agent has access only to
the messages being exchanged (which are externally observable), but cannot directly observe 
the internal state of the conversation in each participant. Overhearing is a form of message 
monitoring. 

Representations that support state monitoring use places to denote the conversation
states of the participants. Tokens placed in these places (the net marking) denote the 
current state. The sending or receiving of a message by a participant is not explicitly 
represented, and is instead implied by moving tokens (through transition firings) to the new 
state places. Thus, such a representation essentially assumes that the internal conversation 
state of participants is directly observable by the monitoring agent. Previous work utilizing 
state monitoring includes work by ? (?, ?, ?, ?, ?, ?). 

The representation we present in this paper is intended for overhearing tasks, and cannot 
assume that the conversation states of overheard agents are observable. Instead, it must 
support message monitoring, where in addition to using tokens in state places (to denote 
current conversation state), the representation uses message places, where tokens are placed 
when a corresponding message is overheard. A conversation-state place and a message 
place are connected via a transition to a state place denoting the new conversation state. 
Tokens placed in these originating places (indicating that a message was received at an appropriate
conversation state) will cause the transition to fire, and the tokens to be placed in the
new conversation state place. Thus the new conversation state is inferred from "observing"
a message. Previous investigations that have used explicit message places include work by
? (?, ?, ?, ?, ?, ?, ?)². These are discussed in depth below.
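
The message-monitoring scheme described above can be sketched in a few lines. The following Python fragment is a minimal illustration under simplifying assumptions, not the representation developed in this paper: the class name MessageMonitor, its methods, and the place names S1, S2 and msg are hypothetical, and token colors are ignored.

```python
from collections import Counter


class MessageMonitor:
    """Toy net: a state place plus a message place feed one transition."""

    def __init__(self):
        self.marking = Counter()   # place name -> number of tokens
        self.transitions = []      # (state place, message place, next state place)

    def add_transition(self, state, message, next_state):
        self.transitions.append((state, message, next_state))

    def overhear(self, message):
        """Mark the message place, then fire the first enabled transition."""
        self.marking[message] += 1
        for state, msg, next_state in self.transitions:
            if msg == message and self.marking[state] > 0:
                # Consume one token from each input place.
                self.marking[state] -= 1
                self.marking[msg] -= 1
                # The new conversation state is inferred, not observed.
                self.marking[next_state] += 1
                return next_state
        return None  # message does not fit the current conversation state
```

An overheard message that does not match the current conversation state leaves its token in the message place and fires nothing.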

2.2 Representing a Single Conversation 

Two representation variants are popular within those that utilize conversation places (in 
addition to message places): Individual state representations use separate places and tokens 
for the state of each participant (each role). Thus, the overall state of the conversation is 
represented by different tokens marking multiple places. Joint state representations use a 
single place for each joint conversation state of all participants. The placement of a token 
within such a place represents the overhearing agent's belief that the participants are in the 
appropriate joint state. 

Most previous representations use individual states. In these, different markings distin- 
guish a conversation state where one agent has sent a message, from a state where the other 
agent received it. The net for each conversation role is essentially built separately, and is 
merged with the other nets, or connected to them via fusion places or similar means. 

? (?, ?, ?) have used CP-nets with individual state places for representing KQML and 
FIPA interaction protocols. Transitions represent message events, and CP-net features, 
such as token colors and arc expressions, are used to represent AUML message attributes 
and sequence expressions. The authors also point out that deadlines (a temporal aspect 
of interaction) can be modelled, but no implementation details are provided. ? (?) also 
proposed using hierarchical CP-nets to represent hierarchical multi-agent conversations. 

? (?, ?) represented conversation roles as separate CP-nets, where places denote both 
interaction messages and states, while transitions represent operations performed on the cor- 
responding communicative acts such as send, receive, and process. Special in/out places are 
used to pass net tokens between the different CP-nets, through special get/put transitions, 
simulating the actual transmission of the corresponding communicative acts. 



2. ? (?, ?, ?) present examples of both state-monitoring and message-monitoring representations.

In principle, individual-state representations require two places in each role, for every
message. For a given message, there would be two individual places for the sender (before
sending and after sending), and similarly two more for each receiver (before receiving and
after receiving). All possible conversation states (valid or not) can be represented. For a
single message and two roles, there are two places for each role (four places total), and four
possible conversation states: message sent and received, sent and not received, not sent but
incorrectly believed to have been received, not sent and not received. These states can be
represented by different markings. For instance, a conversation state where the message has
been sent but not received is denoted by a token in the 'after-sending' place of the sender
and another token in the 'before-receiving' place of the receiver. This is summarized in the
following proposition:

Proposition 1 Given a conversation with R roles and a total of M possible messages, an
individual state representation has space complexity of O(MR).

While the representations above all represent each role's conversation state separately, 
many applications of overhearing only require representation of valid conversation states 
(message not sent and not received, or sent and received). Indeed, specifications for inter- 
action protocols often assume the use of underlying synchronization protocols to guarantee 
delivery of messages (?, ?). Under such an assumption, for every message, there are only 
two joint states regardless of the number of roles. For example, for a single message and
three roles (a sender and two receivers), there are two places and two possible markings: a
token in a before-sending/receiving place represents a conversation state where the message
has not yet been sent by the sender (and the two receivers are waiting for it), while a token
in an after-sending/receiving place denotes that the message has been sent and received by
both receivers.

? (?) utilize CP-nets where places denote joint conversation states. They also utilize
places representing communicative acts. ? (?) proposed a representation based on
Place-Transition nets (PT-nets), a more restricted representation of Petri nets that has no color.
They presented several interaction building blocks, which could then fit together to model
additional conversation protocols. In general, the following proposition holds with respect
to such representations:

Proposition 2 Given a conversation with R roles and a total of M possible messages, a
joint state representation that represents only legal states has space complexity of O(M).

The condition of representing only valid states is critical to the complexity analysis. If all
joint conversation states (valid and invalid) are to be represented, the space complexity would
be O(M^R). In such a case, an individual-state representation would have an advantage. This
would be the case, for instance, if we do not assume the use of synchronization protocols,
e.g., where the overhearing agent may wish to track the exact system state even while a
message is underway (i.e., sent and not yet received).
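
The counting argument behind Propositions 1 and 2 can be checked for a single message by brute-force enumeration. The sketch below is illustrative only; the function names are hypothetical, and each role is assumed to be either before or after the message.

```python
from itertools import product


def individual_places(num_roles):
    """Two individual places (before/after) per role for one message."""
    return 2 * num_roles


def all_states(num_roles):
    """Every before/after combination over the roles, valid or not."""
    return list(product(("before", "after"), repeat=num_roles))


def legal_states(num_roles):
    """With guaranteed delivery, all roles are before or all are after."""
    return [s for s in all_states(num_roles) if len(set(s)) == 1]
```

For two roles this gives four places and four conversation states, two of them legal; for three roles there are eight combinations but still only two legal joint states.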

2.3 Representing Multiple Concurrent Conversations 

Propositions 1 and 2 above address the space complexity of representing a single conversation.
However, in large scale systems an overhearing agent may be required to monitor
multiple conversations in parallel. For instance, an overhearing agent may be monitoring a
middle agent that is carrying multiple parallel instances of a single interaction protocol with
multiple partners, e.g., brokering (?).

Some previous investigations propose to duplicate the appropriate Petri net representation
for each monitored conversation (?, ?). In this approach, every conversation is tracked
by a separate Petri net, and thus the number of Petri nets (and their associated tokens)
grows with the number of conversations (Proposition 3). For instance, ? (?) shows an example
where a contract-net protocol is carried out with three different contractors, using
three duplicate CP-nets. This is captured in the following proposition:

Proposition 3 A representation that creates multiple instances of a conversation Petri net
to represent C conversations requires O(C) net structures, and O(C) bits for all tokens.

Other investigations take a different approach, in which a single CP-net structure is used
to monitor all conversations of the same protocol. The tokens associated with conversations
are differentiated by their token color (?, ?, ?, ?, ?, ?, ?, ?). For example, by assigning each
token a color of the tuple type (sender, receiver), an agent can differentiate multiple tokens
in the same place and thus track conversations of different pairs of agents³. Color tokens
use multiple bits per token; up to log C bits are required to differentiate C conversations.
Therefore, the number of bits required to track C conversations using C tokens is C log C.
This leads to the following proposition.

Proposition 4 A representation that uses color tokens to represent C multiple instances of
a conversation requires O(1) net structures, and O(C log C) bits for all tokens.

Due to the constants involved, the space requirements of Proposition 3 are in practice
much more expensive than those of Proposition 4. Proposition 3 refers to the creation of
O(C) Petri nets, each with duplicated place and transition data structures. In contrast,
Proposition 4 refers to the bits required for representing C color tokens on a single CP-net.
Moreover, in most practical settings, a sufficiently large constant bound on the number of
conversations may be found, which will essentially reduce the O(log C) factor to O(1).
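
A rough calculation makes the comparison concrete. The Python sketch below is only illustrative: the function names are hypothetical, and the cost of a duplicated net is approximated by counting its place and transition objects, which is an assumption made here rather than a measure used in the paper.

```python
def bits_per_token(num_conversations):
    """Bits needed to give each of C conversations a distinct color."""
    return max(1, (num_conversations - 1).bit_length())


def color_token_bits(num_conversations):
    """Total token bits on a single CP-net: C tokens of about log2(C) bits."""
    return num_conversations * bits_per_token(num_conversations)


def duplicated_net_objects(num_conversations, places, transitions):
    """Place and transition objects when every conversation gets its own net."""
    return num_conversations * (places + transitions)
```

For 1024 conversations, color_token_bits(1024) gives 10240 bits on one shared net, whereas duplicating a net with three places and one transition allocates 4096 separate place and transition objects.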

Based on Propositions 1-4, it is possible to make concrete predictions as to the scalability
of different approaches with respect to the number of conversations and roles. Table 1 shows
the space complexity of different approaches when modelling C conversations of the same
protocol, each with a maximum of R roles, and M messages, under the assumption of
underlying synchronization protocols. The table also cites relevant previous work.

Building on the insights gained from Table 1, we propose a representation using CP-nets
where places explicitly represent joint conversation states (corresponding to the lower-right
cell in Table 1), and token color is used to distinguish concurrent conversations (as in the
upper-right cell in Table 1). As such, it is related to the works that have these features, but
as the table demonstrates, is a novel synthesis.

Our representation uses similar structures to those found in the works of ? (?) and ?
(?). However, in contrast to these previous investigations, we rely on token color in CP-nets
to model concurrent conversations, with space complexity O(M + C log C). We also show
(Sections 3-6) how it can be used to cover a variety of conversation features not covered by
these investigations. These features include representation of a full set of FIPA interaction
building blocks, communicative act attributes (such as message guards, sequence expressions,
etc.), compact modelling of concurrent conversations, nested and interleaved interactions,
and temporal aspects.

3. See Section 4 to distinguish between different conversations by the same agents.

Representing Multiple Conversations (of Same Protocol)

Table 1: Scalability of different representations




3. Representing Simple & Complex Interaction Building Blocks 

This section introduces the fundamentals of our representation, and demonstrates how various
simple and complex AUML interaction messages, used in FIPA conversation standards
(?), can be implemented using the proposed CP-net representation. We begin with a simple
conversation, shown in Figure 1-a using an AUML protocol diagram. Here, agent1 sends an
asynchronous message msg to agent2.

Figure 1: Asynchronous message interaction. (a) AUML representation; (b) CP-net representation.



To represent agent conversation protocols, we define two types of places, corresponding 
to messages and conversation states. The first type of net places, called message places, is 
used to describe conversation communicative acts. Tokens placed in message places indicate 
that the associated communicative act has been overheard. The second type of net places, 
agent places, is associated with the valid joint conversation states of the interacting agents.
Tokens placed in agent places indicate the current joint state of the conversation within the 
interaction protocol. 

Transitions represent the transmission and receipt of communicative acts between agents. 
Assuming underlying synchronization protocols, a transition always originates within a joint- 
state place and a message place, and targets a joint conversation state (more than one is 
possible; see below). Normally, the current conversation state is known (marked with a
token), and must await the overhearing of the matching message (denoted with a token at
the connected message place). When this message place is marked, the transition fires, automatically
marking the new conversation state.

Figure 1-b presents the CP-net representation of the earlier example of Figure 1-a. The CP-net
in Figure 1-b has three places and one transition connecting them. The A1B1 and the
A2B2 places are agent places, while the msg place is a message place. The A and B capital
letters are used to denote the agent1 and the agent2 individual interaction states respectively
(we have indicated the individual and the joint interaction states over the AUML diagram
in Figure 1-a, but omit these annotations in later figures). Thus, the A1B1 place indicates a
joint interaction state where agent1 is ready to send the msg communicative act to agent2
(A1) and agent2 is waiting to receive the corresponding message (B1). The msg message
place corresponds to the msg communicative act sent between the two agents. Thus, the
transmission of the msg communicative act causes the agents to transition to the A2B2
place. This place corresponds to the joint interaction state in which agent1 has already sent
the msg communicative act to agent2 (A2) and agent2 has received it (B2).

The CP-net implementation in Figure 1-b also introduces the use of token colors to
represent additional information about interaction states and communicative acts. The
token color sets are defined in the net declaration, i.e. the dashed box in Figure 1-b. The
syntax follows the standard CPN ML notation (?, ?, ?). The AGENT color identifies the 
agents participating in the interaction, and is used to construct the two compound color 
sets. 

The INTER-STATE color set is associated with agent places, and represents agents in
the appropriate joint interaction states. It is a record (a1, a2), where a1 and a2 are AGENT
color elements distinguishing the interacting agents. We apply the INTER-STATE color
set to model multiple concurrent conversations using the same CP-net. The second color
set is MSG, describing interaction communicative acts and associated with message places.
The MSG color token is a record (a_s, a_r), where a_s and a_r correspond to the sender and
the receiver agents of the associated communicative act. In both cases, additional elements,
such as conversation identification, may be used. See Section 4 for additional details.

In Figure 1-b, the A1B1 and the A2B2 places are associated with the INTER-STATE
color set, while the msg place is associated with the MSG color set. The place color set
is written in italic capital letters next to the corresponding place. Furthermore, we use
the s and r AGENT color type variables to denote the net arc expressions. Thus, given
that the output arc expression of both the A1B1 and the msg places is (s,r), the s and r
elements of the agent place token must correspond to the s and r elements of the message
place token. Consequently, the net transition occurs if and only if the agents of the message
correspond to the interacting agents. The A2B2 place input arc expression is (r, s) following
the underlying intuition that agent2 is going to send the next interaction communicative
act.
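
The role of token colors and arc expressions in the asynchronous message net can be illustrated with a small Python sketch. It is a simplified stand-in for a CP-net engine, written under stated assumptions: the class and method names are hypothetical, colors are plain (agent, agent) tuples, and conversation identifiers are omitted.

```python
from collections import Counter


class AsyncMessageNet:
    """One net structure tracking many conversations through token colors."""

    def __init__(self):
        self.A1B1 = Counter()  # INTER-STATE tokens, colored (a1, a2)
        self.msg = Counter()   # MSG tokens, colored (sender, receiver)
        self.A2B2 = Counter()  # INTER-STATE tokens after the message

    def start(self, a1, a2):
        """Mark the initial joint state for the pair (a1, a2)."""
        self.A1B1[(a1, a2)] += 1

    def overhear(self, sender, receiver):
        """Mark the message place, then try to fire with binding (s, r)."""
        self.msg[(sender, receiver)] += 1
        return self._fire(sender, receiver)

    def _fire(self, s, r):
        # Both input arcs carry (s, r), so the same binding must be
        # available in the agent place and in the message place.
        if self.A1B1[(s, r)] > 0 and self.msg[(s, r)] > 0:
            self.A1B1[(s, r)] -= 1
            self.msg[(s, r)] -= 1
            # The arc into A2B2 carries (r, s): the receiver is expected
            # to send the next communicative act.
            self.A2B2[(r, s)] += 1
            return True
        return False
```

Because tokens of different pairs of agents coexist in the same places, starting a second conversation adds a token rather than a second net.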

Figure 2-a shows an AUML representation of another interaction building block, synchronous
message passing, denoted by the filled solid arrowhead. Here, the msg communicative
act is sent synchronously from agent1 to agent2, meaning that an acknowledgement
on the msg communicative act must always be received by agent1 before the interaction may
proceed.

Figure 2: Synchronous message interaction. (a) AUML representation; (b) CP-net representation.


The corresponding CP-net representation is shown in Figure 2-b. The interaction starts
in the A1B1 place and terminates in the A2B2 place. The A1B1 place represents a joint
interaction state where agent1 is ready to send the msg communicative act to agent2 (A1)
and agent2 is waiting to receive the corresponding message (B1). The A2B2 place denotes
a joint interaction state, in which agent1 has already sent the msg communicative act to
agent2 (A2) and agent2 has received it (B2). However, since the CP-net diagram represents
synchronous message passing, the msg communicative act transmission cannot cause the
agents to transition directly from the A1B1 place to the A2B2 place. We therefore define an
intermediate A'1B'1 agent place. This place represents a joint interaction state where agent2
has received the msg communicative act and is ready to send an acknowledgement on it
(B'1), while agent1 is waiting for that acknowledgement (A'1). Taken together, the msg
communicative act causes the agents to transition from the A1B1 place to the A'1B'1 place,
while the acknowledgement on the msg message causes the agents to transition from the
A'1B'1 place to the A2B2 place.
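
The two-step structure of the synchronous building block can be summarized as a transition table over joint states. The Python sketch below is a simplified illustration: the table layout, the track helper and the message label ack (standing for the acknowledgement on msg) are assumptions made for this example, and token colors are left out.

```python
# (current joint state, overheard message) -> next joint state
SYNC_NET = {
    ("A1B1", "msg"): "A'1B'1",   # msg delivered, acknowledgement pending
    ("A'1B'1", "ack"): "A2B2",   # acknowledgement received, block complete
}


def track(messages, start="A1B1", net=SYNC_NET):
    """Follow overheard messages; stop at the first one that is not legal."""
    state = start
    for message in messages:
        next_state = net.get((state, message))
        if next_state is None:
            return state, False
        state = next_state
    return state, True
```

Overhearing msg alone leaves the conversation in the intermediate place, so A2B2 is reached only once the acknowledgement has also been overheard.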

Transitions in typical multi-agent interaction protocols are composed of interaction
building blocks, two of which have been presented above. Additional interaction building
blocks, which are fairly straightforward (or have appeared in previous work, e.g., ?), are
presented in Appendix B. In the remainder of this section, we present two complex interaction
building blocks that are generally common in multi-agent interactions: XOR-decision
and OR-parallel.

We begin with the XOR-decision interaction. The AUML representation of this building
block is shown in Figure 3-a. The sender agent agent1 can either send message msg1 to
agent2 or message msg2 to agent3, but it cannot send both msg1 and msg2. The non-filled
diamond with an 'x' inside is the AUML notation for this constraint.

[Figure 3 panels: (a) AUML representation; (b) CP-net representation. Declarations shown
with panel (b):]

    color AGENT = ...;
    color INTER-STATE = record a1:AGENT * a2:AGENT;
    color INTER-STATE-3 = record a1:AGENT * a2:AGENT * a3:AGENT;
    color MSG = record s:AGENT * r:AGENT;
    var s, r1, r2: AGENT;

Figure 3: XOR-decision messages interaction. 

Figure 3-b shows the corresponding CP-net. Again, the A, B and C capital letters
are used to denote the interaction states of agent1, agent2 and agent3, respectively. The
interaction starts from the A1B1C1 place and terminates either in the A2B2 place or in the
A2C2 place. The A1B1C1 place represents a joint interaction state where agent1 is ready to
send either the msg1 communicative act to agent2 or the msg2 communicative act to agent3
(A1); and agent2 and agent3 are waiting to receive the corresponding msg1/msg2 message
(B1/C1). To represent the A1B1C1 place color set, we extend the INTER-STATE color
set to denote a joint interaction state of three interacting agents, i.e. using the
INTER-STATE-3 color set. The msg1 communicative act causes the agents to transition to the A2B2
place. The A2B2 place represents a joint interaction state where agent1 has sent the msg1
message (A2) and agent2 has received it (B2). Similarly, the msg2 communicative act causes
agents agent1 and agent3 to transition to the A2C2 place. Exclusiveness is achieved since the
single agent token in the A1B1C1 place can be used either for activating the A1B1C1 → A2B2
transition or for activating the A1B1C1 → A2C2 transition, but not both.
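
The exclusiveness argument can be illustrated with a small token-game sketch. The code below is not part of the representation itself: it assumes a deliberately minimal Petri net model in Python, where a marking is a dictionary of multisets and the hypothetical helper `fire` consumes and produces tokens. Place names follow Figure 3-b; everything else is an illustrative assumption.

```python
from collections import Counter

def fire(marking, inputs, outputs):
    """Fire a transition: consume one token per input arc, produce one per output arc.

    Returns False (and changes nothing) when some input token is missing.
    """
    if any(marking[place][token] < 1 for place, token in inputs):
        return False
    for place, token in inputs:
        marking[place][token] -= 1
    for place, token in outputs:
        marking[place][token] += 1
    return True

# Initial marking: a single agent token in A1B1C1, and both messages available.
state = ("agent1", "agent2", "agent3")
marking = {name: Counter() for name in ("A1B1C1", "msg1", "msg2", "A2B2", "A2C2")}
marking["A1B1C1"][state] = 1
marking["msg1"][("agent1", "agent2")] = 1
marking["msg2"][("agent1", "agent3")] = 1

# The two transitions compete for the same agent token in A1B1C1.
to_a2b2 = ([("A1B1C1", state), ("msg1", ("agent1", "agent2"))],
           [("A2B2", ("agent1", "agent2"))])
to_a2c2 = ([("A1B1C1", state), ("msg2", ("agent1", "agent3"))],
           [("A2C2", ("agent1", "agent3"))])

fired_first = fire(marking, *to_a2b2)    # consumes the only agent token
fired_second = fire(marking, *to_a2c2)   # no agent token left, so not enabled
print(fired_first, fired_second)
```

Once the first transition has consumed the agent token, the second one is no longer enabled, which is the XOR behavior described above.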

A similar complex interaction is the OR-parallel messages interaction. Its AUML
representation is presented in Figure 4-a. The sender agent, agent1, can send message msg1 to
agent2 or message msg2 to agent3, or both. The non-filled diamond is the AUML notation
for this constraint.

Figure 4: OR-parallel messages interaction.

Figure 4-b shows the CP-net representation of the OR-parallel interaction. The
interaction starts from the A1B1C1 place but it can be terminated in the A2B2 place, or in the
A2C2 place, or in both. To represent this inclusiveness of the interaction protocol, we define
two intermediate places, the A'1B1 place and the A''1C1 place. The A'1B1 place represents a
joint interaction state where agent1 is ready to send the msg1 communicative act to agent2
(A'1) and agent2 is waiting to receive the message (B1). The A''1C1 place has a similar
meaning, but with respect to agent3. As normally done in Petri nets, the transition connecting
the A1B1C1 place to the intermediate places duplicates any single token in the A1B1C1 place
into two tokens going into the A'1B1 and the A''1C1 places. Consequently, the two parts of
the OR-parallel interaction can be independently executed.
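
A minimal sketch of the token duplication, again assuming a toy Python marking model (dictionary of multisets, hypothetical `fire` helper) rather than a real CP-net tool. The place names come from Figure 4-b; the order in which messages arrive is an assumption made for the example.

```python
from collections import Counter

def fire(marking, inputs, outputs):
    """Consume one token per input arc and produce one per output arc, if enabled."""
    if any(marking[place][token] < 1 for place, token in inputs):
        return False
    for place, token in inputs:
        marking[place][token] -= 1
    for place, token in outputs:
        marking[place][token] += 1
    return True

places = ("A1B1C1", "A'1B1", "A''1C1", "msg1", "msg2", "A2B2", "A2C2")
marking = {name: Counter() for name in places}
state = ("agent1", "agent2", "agent3")
pair_b = ("agent1", "agent2")
pair_c = ("agent1", "agent3")
marking["A1B1C1"][state] = 1
marking["msg1"][pair_b] = 1          # assumption: only msg1 has been sent so far

# The splitting transition turns one token into two, one per intermediate place.
split = ([("A1B1C1", state)], [("A'1B1", pair_b), ("A''1C1", pair_c)])
branch_b = ([("A'1B1", pair_b), ("msg1", pair_b)], [("A2B2", pair_b)])
branch_c = ([("A''1C1", pair_c), ("msg2", pair_c)], [("A2C2", pair_c)])

did_split = fire(marking, *split)
b_done = fire(marking, *branch_b)     # proceeds on its own
c_before = fire(marking, *branch_c)   # not enabled: msg2 has not been sent
marking["msg2"][pair_c] = 1           # msg2 is sent later
c_after = fire(marking, *branch_c)    # now the second branch completes as well
```

Each branch advances only when its own message token is present, so the interaction may end in A2B2, in A2C2, or in both.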



4. Representing Interaction Attributes 

We now extend our representation to allow additional interaction aspects, useful in describing 
multi-agent conversation protocols. First, we show how to represent interaction message 
attributes, such as guards, sequence expressions, cardinalities and content (?). We then 
explore in depth the representation of multiple concurrent conversations (on the same CP 
net). 

Figure 5-a shows a simple agent interaction using an AUML protocol diagram. This
interaction is similar to the one presented in Figure 1-a in the previous section. However,
Figure 5-a uses an AUML message guard-condition, marked as [condition], that has the
following semantics: the communicative act is sent from agent1 to agent2 if and only if the
condition is true.



[Figure 5 panels: (a) AUML representation, in which agent1 sends msg to agent2 under the
guard [condition]; (b) CP-net representation. Declarations shown with panel (b):]

    color AGENT = ...;
    color TYPE = ...;
    color CONTENT = ...;
    color INTER-STATE = record a1:AGENT * a2:AGENT;
    color MSG = record s:AGENT * r:AGENT * t:TYPE * c:CONTENT;
    var s, r: AGENT; var t: TYPE;
    var c: CONTENT;

Figure 5: Message guard-condition 

The guard-condition implementation in our Petri net representation uses transition
guards (Figure 5-b), a native feature of CP-nets. The AUML guard condition is mapped
directly to the CP-net transition guard. The CP-net transition guard is indicated on the
net inscription next to the corresponding transition using square brackets. The transition
guard guarantees that the transition is enabled if and only if the transition guard is true.

In Figure 5-b, we also extend the color of tokens to include information about the
communicative act being used and its content. We extend the MSG color set definition
to a record (s, r, t, c), where the s and r elements have the same interpretation as in the previous
section (sender and receiver), and the t and c elements define the message type and content,
respectively. The t element is of a new color TYPE, which determines communicative act
types. The c element is of a new color CONTENT, which represents communicative act
content and argument list (e.g. reply-to, reply-by, etc.).
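
As an illustration, the extended message record and a transition guard can be sketched in Python. This is only an assumed analogue of the CPN ML declarations: the class `Msg`, the function `enabled` and the example guard on the message type are hypothetical names introduced here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Msg:
    """Analogue of the MSG record (s, r, t, c)."""
    s: str  # sender
    r: str  # receiver
    t: str  # communicative act type
    c: str  # content and argument list

def enabled(inter_state, msg, guard):
    """A transition is enabled iff the message matches the joint state and the guard holds."""
    a1, a2 = inter_state
    return msg.s == a1 and msg.r == a2 and guard(msg)

# Hypothetical guard standing in for the AUML [condition].
def condition(msg):
    return msg.t == "inform"

state = ("agent1", "agent2")
ok = enabled(state, Msg("agent1", "agent2", "inform", "weather"), condition)
blocked = enabled(state, Msg("agent1", "agent2", "request", "weather"), condition)
wrong_pair = enabled(state, Msg("agent3", "agent2", "inform", "weather"), condition)
```

The guard is evaluated on the bound message token, so a message of the wrong type or between the wrong agents leaves the transition disabled.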

The addition of new elements also allows for additional potential uses. For instance, 
to facilitate representation of multiple concurrent conversations between the same agents 
(s and r), it is possible to add a conversation identification field to both the MSG and 
INTER-STATE colors. For simplicity, we refrain from doing so in the examples in this 
paper. 

Two additional AUML communicative act attributes that can be modelled in the CP-net
representation are message sequence-expression and message cardinality. The
sequence-expressions denote a constraint on the message sent from the sender agent. There are a number
of sequence-expressions defined by FIPA conversation standards (?): m denotes that the
message is sent exactly m times; n..m denotes that the message is sent anywhere from n up
to m times; * denotes that the message is sent an arbitrary number of times. An additional
important sequence expression is broadcast, i.e. the message is sent to all other agents.

We now explain the representation of sequence-expressions in CP-nets, using broadcast
as an example (Figure 6-b). Other sequence expressions are easily derived from this example.
We define an INTER-STATE-CARD color set. This color set is a tuple ((a1, a2), i) consisting
of two elements. The first tuple element is an INTER-STATE color element, which denotes
the interacting agents as previously defined. The second tuple element is an integer that
counts the number of messages already sent by an agent, i.e. the message cardinality.
This element is initially assigned to 0. The INTER-STATE-CARD color set is applied to
the S1R1 place, where the S and R capital letters are used to denote the sender and the
receiver individual interaction states respectively and S1R1 indicates the initial joint
interaction state of the interacting agents. The two additional colors, used in Figure 6-b, are
the BROADCAST-LIST and the TARGET colors. The BROADCAST-LIST color defines
the sender broadcast list of the designated receivers, assuming that the sender must have
such a list to carry out its role. The TARGET color defines indexes into this broadcast list.

According to the broadcast sequence-expression semantics, the sender agent sends the
same msg1 communicative act to all the receivers on the broadcast list. The CP-net
introduced in Figure 6-b models this behavior (see footnote 4). The interaction starts from the
S1R1 place, representing the joint interaction state where sender is ready to send the msg1
communicative act to receiver (S1) and receiver is waiting to receive the corresponding msg1
message (R1). The S1R1 place initial marking is a single token, set by the initialization
expression (underlined, next to the corresponding place). The initialization expression
1`((s,TARGET(0)),0), given in standard CPN ML notation, determines that the S1R1
place's initial marking is a multi-set containing a single token ((s,TARGET(0)),0). Thus,
the first designated receiver is assigned to be the agent with index 0 on the broadcast list,
and the message cardinality counter is initialized to 0.

4. We implement broadcast as an iterative procedure sending the corresponding communicative act
separately to all designated recipients.

[Figure 6 panels: (a) AUML representation, in which sender broadcasts msg1 to receiver and
receives msg2 in reply; (b) CP-net representation. Declarations shown with panel (b):]

    color AGENT = ...;
    color TYPE = ...;
    color CONTENT = ...;
    color INTER-STATE = record a1:AGENT * a2:AGENT;
    color CARD = int;
    color INTER-STATE-CARD = product INTER-STATE * CARD;
    color MSG = record s:AGENT * r:AGENT * t:TYPE * c:CONTENT;
    color BROADCAST-LIST = AGENT with ...;
    val size = ...;
    color TARGET = index BROADCAST-LIST with 0..size-1;
    var s, r: AGENT; var msg: MSG; var i: CARD;

Figure 6: Broadcast sequence expression.

The msg1 message place initially contains multiple tokens. Each of these tokens
represents the msg1 communicative act addressed to a different designated receiver on the
broadcast list. In Figure 6-b, the initialization expression, corresponding to the msg1
message place, has been omitted. The S1R1 place token and the appropriate msg1 place token
together enable the corresponding transition. Consequently, the transition may fire and thus
the msg1 communicative act transmission is simulated.

The msg1 communicative act is sent incrementally to every designated receiver on the
broadcast list. The incoming arc expression ((s,r),i) is incremented by the transition to
the outgoing ((s,TARGET(i+1)),i+1) arc expression, causing the receiver agent with
index i+1 on the broadcast list to be selected. The transition guard constraint i < size,
i.e. i < |broadcast list|, ensures that the msg1 message is sent no more than |broadcast list|
times. The msg1 communicative act causes the agents to transition to the S2R2 place.
This place represents a joint interaction state in which sender has already sent the msg1
communicative act to receiver and is now waiting to receive the msg2 message (S2) and
receiver has received the msg1 message and is ready to send the msg2 communicative act
to sender (R2). Finally, the msg2 message causes the agents to transition to the S3R3
place. The S3R3 place denotes a joint interaction state where sender has received the msg2
communicative act from receiver and terminated (S3), while receiver has already sent the
msg2 message to sender and terminated as well (R3).
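
The iterative broadcast can be sketched as a loop over the S1R1 token. The Python below is a simplified, assumed analogue: the function names `target` and `broadcast` are hypothetical, and returning None for an index past the end of the list is a choice made for this sketch, not something the CP-net specifies.

```python
def target(broadcast_list, i):
    """TARGET(i): the i-th designated receiver, or None past the end (sketch assumption)."""
    return broadcast_list[i] if i < len(broadcast_list) else None

def broadcast(sender, broadcast_list):
    """Simulate repeated firing of the msg1 transition under the guard [i < size]."""
    size = len(broadcast_list)
    token = ((sender, target(broadcast_list, 0)), 0)   # initial marking ((s,TARGET(0)),0)
    s2r2 = []                                          # tokens deposited in the S2R2 place
    while True:
        (s, r), i = token
        if not i < size:                               # transition guard
            break
        s2r2.append((s, r))                            # msg1 sent to r
        token = ((s, target(broadcast_list, i + 1)), i + 1)
    return s2r2, token

sent, final_token = broadcast("agent1", ["agent2", "agent3", "agent4"])
```

The counter in the token both selects the next receiver and bounds the number of firings, which is why the same pattern extends to the m and n..m sequence expressions.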

We use Figure 6-b to demonstrate the use of token color to represent multiple concurrent
conversations using the same CP-net. For instance, let us assume that the sender agent is
called agent1 and its broadcast list contains the following agents: agent2, agent3, agent4,
agent5 and agent6. We will also assume that agent1 has already sent the msg1
communicative act to all agents on the broadcast list. However, it has only received the msg2
reply message from agent3 and agent6. Thus, the CP-net current marking for the complete
interaction protocol is described as follows: the S2R2 place is marked by (agent2, agent1),
(agent4, agent1), (agent5, agent1), while the S3R3 place contains the tokens (agent1, agent3)
and (agent1, agent6).



Figure 7: FIPA Query Interaction Protocol - AUML representation.



An Example. We now construct a CP-net representation of the FIPA Query Interaction
Protocol (?), shown in AUML form in Figure 7, to demonstrate how the building blocks
presented in Sections 3 and 4 can be put together. In this interaction protocol, the Initiator
requests the Participant to perform an inform action using one of two query communicative
acts, query-if or query-ref. The Participant processes the query and makes a decision
whether to accept or refuse the query request. The Initiator may request the Participant
to respond with either an agree or refuse message, and for simplicity, we will assume
that this is always the case. In case the query request has been accepted, the Participant
informs the Initiator of the query results. If the Participant fails, then it communicates a
failure. In a successful response, the Participant replies with one of two versions of inform
(inform-t/f or inform-result) depending on the type of initial query request.

The CP-net representation of the FIPA Query Interaction Protocol is presented in
Figure 8. The interaction starts in the I1P1 place (we use the I and the P capital letters
to denote the Initiator and the Participant roles). The I1P1 place represents a joint
interaction state where (i) the Initiator agent is ready to send either the query-if
communicative act, or the query-ref message, to Participant (I1); and (ii) Participant is
waiting to receive the corresponding message (P1). The Initiator can send either a query-if
or a query-ref communicative act. We assume that these acts belong to the same class,
the query communicative act class. Thus, we implement both messages using a single
Query message place, and check the message type using the following transition guard:
[#t msg = query-if or #t msg = query-ref]. The query communicative act causes the
interacting agents to transition to the I2P2 place. This place represents a joint interaction
state in which Initiator has sent the query communicative act and is waiting to receive
a response message (I2), and Participant has received the query communicative act and
is deciding whether to send an agree or a refuse response message to Initiator (P2). The
refuse communicative act causes the agents to transition to the I3P3 place, while the agree
message causes the agents to transition to the I4P4 place.

Figure 8: FIPA Query Interaction Protocol - CP-net representation.

The Participant decision on whether to send an agree or a refuse communicative
act is represented using the XOR-decision building block introduced earlier (Figure 3-b).
The I3P3 place represents a joint interaction state where Initiator has received a refuse
communicative act and terminated (I3) and Participant has sent a refuse message and
terminated as well (P3). The I4P4 place represents a joint interaction state in which Initiator
has received an agree communicative act and is now waiting for further response from
Participant (I4) and Participant has sent an agree message and is now deciding which
response to send to Initiator (P4). At this point, the Participant agent may send one
of the following communicative acts: inform-t/f, inform-result and failure. The choice is
represented using another XOR-decision building block, where the inform-t/f and
inform-result communicative acts are represented using a single Inform message place. The failure
communicative act causes a transition to the I5P5 place, while the inform message causes
a transition to the I6P6 place. The I5P5 place represents a joint interaction state where
Participant has sent a failure message and terminated (P5), while Initiator has received
a failure and terminated (I5). The I6P6 place represents a joint interaction state in which
Participant has sent an inform message and terminated (P6), while Initiator has received
an inform and terminated (I6).

The implementation of the [query-if] and the [query-ref] message guard conditions
requires a detailed discussion. These are not implemented in the usual manner, since
they depend on the original request communicative act. Thus, we create a special
intermediate place that contains the original message type, marked "Original Message Type"
in the figure. In case an inform communicative act is sent, the transition guard verifies
that the inform message is appropriate to the original query type. Thus, an inform-t/f
communicative act can be sent only if the original query type has been query-if and an
inform-result message can be sent only if the original query type has been query-ref.
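
The protocol just described can be traced with a small state tracker. The sketch below is an assumed Python simplification for a single conversation: the class `QueryConversation` and the table `EXPECTED_INFORM` are hypothetical names, the attribute `original_type` plays the role of the "Original Message Type" place, and token colors other than the message type are omitted.

```python
# Which inform variant is legal for which original query type.
EXPECTED_INFORM = {("inform-t/f", "query-if"), ("inform-result", "query-ref")}

class QueryConversation:
    """Tracks one FIPA Query conversation through the places I1P1..I6P6."""

    def __init__(self):
        self.place = "I1P1"
        self.original_type = None   # stands in for the "Original Message Type" place

    def observe(self, msg_type):
        """Advance on a message type; return False if no transition is enabled."""
        if self.place == "I1P1" and msg_type in ("query-if", "query-ref"):
            self.original_type = msg_type
            self.place = "I2P2"
        elif self.place == "I2P2" and msg_type == "refuse":
            self.place = "I3P3"
        elif self.place == "I2P2" and msg_type == "agree":
            self.place = "I4P4"
        elif self.place == "I4P4" and msg_type == "failure":
            self.place = "I5P5"
        elif self.place == "I4P4" and (msg_type, self.original_type) in EXPECTED_INFORM:
            self.place = "I6P6"
        else:
            return False
        return True

conv = QueryConversation()
steps = [conv.observe(m) for m in ("query-if", "agree", "inform-result", "inform-t/f")]
```

In this run the inform-result message is rejected because the original query was a query-if, and the conversation then completes on the matching inform-t/f.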

5. Representing Nested & Interleaved Interactions 

In this section, we extend the CP-net representation of previous sections to model nested
and interleaved interaction protocols. We focus here on nested interaction protocols.
Nevertheless, the discussion applies to interleaved interaction protocols in a similar
fashion.

FIPA conversation standards (?) emphasize the importance of nested and interleaved 
protocols in modelling complex interactions. First, this allows re-use of interaction protocols 
in different nested interactions. Second, nesting increases the readability of interaction 
protocols. 

The AUML notation annotates nested and interleaved protocols as round-corner
rectangles (?, ?). Figure 9-a shows an example of a nested protocol (see footnote 5), while
Figure 9-b illustrates an interleaved protocol. Nested protocols have one or more compartments.
The first compartment is the name compartment. The name compartment holds the (optional) name of
the nested protocol. The nested protocol name is written in the upper left-hand corner of
the rectangle, e.g. commitment in Figure 9-a. The second compartment, the guard
compartment, holds the (optional) nested protocol guard. The guard compartment is written
in the lower left-hand corner of the rectangle, e.g. [commit] in Figure 9-a. Nested protocols
without guards are equivalent to nested protocols with the [true] guard.

Figure 10 describes the implementation of the nested interaction protocol presented in
Figure 9-a by extending the CP-net representation to use hierarchies, relying on
standard CP-net methods (see Appendix A). The hierarchical CP-net representation contains
three elements: a superpage, a subpage and a page hierarchy graph. The CP-net superpage
represents the main interaction protocol containing a nested interaction, while the CP-net
subpage models the corresponding nested interaction protocol, i.e. the Commitment
Interaction Protocol. The page hierarchy graph describes how the superpage is decomposed into
subpages.

5. Figure 9-a appears in FIPA conversation standards (?). Nonetheless, note that the request-good and the
request-pay communicative acts are not part of the FIPA-ACL standards.

Figure 9: AUML nested and interleaved protocols examples.

Figure 10: Nested protocol implementation using hierarchical CP-nets.

Let us consider in detail the process of modelling the nested interaction protocol in
Figure 9-a using a hierarchical CP-net, resulting in the net described in Figure 10. First, we
identify the starting and ending points of the nested interaction protocol. The starting point
of the nested interaction protocol is where Buyer1 sends a Request-Good communicative act
to Seller1. The ending point is where Buyer1 receives a Request-Pay communicative act
from Seller1. We model these nested protocol end-points as CP-net socket nodes on the
superpage, i.e. Main Interaction Protocol: B11S11 and Request-Good are input socket
nodes and B13S13 is an output socket node.

The nested interaction protocol, the Commitment Interaction Protocol, is represented
using a separate CP-net, following the principles outlined in Sections 3 and 4. This net
is a subpage of the main interaction protocol superpage. The nested interaction protocol
starting and ending points on the subpage correspond to the net port nodes. The B1S1 and
Request-Good places are the subpage input port nodes, while the B3S3 place is an output
port node. These nodes are tagged with the IN/OUT port type tags correspondingly.

Then, a substitution transition, which is denoted using HS (Hierarchy and
Substitution), connects the corresponding socket places on the superpage. The substitution
transition conceals the nested interaction protocol implementation from the net superpage, i.e.
the Main Interaction Protocol. The nested protocol name and guard compartments are
mapped directly to the substitution transition name and guard respectively. Consequently,
in Figure 10 we define the substitution transition name as Commitment and the substitution
guard is determined to be [commit].

The superpage and subpage interface is provided using the hierarchy inscription. The
hierarchy inscription is indicated using the dashed box next to the substitution
transition. The first line in the hierarchy inscription determines the subpage identity, i.e. the
Commitment Interaction Protocol in our example. Moreover, it indicates that the
substitution transition replaces the corresponding subpage detailed implementation on the
superpage. The remaining hierarchy inscription lines introduce the superpage and subpage port
assignment. The port assignment relates a socket node on the superpage with a port node
on the subpage. The substitution transition input socket nodes are related to the IN-tagged
port nodes. Analogously, the substitution transition output socket nodes correspond to the
OUT-tagged port nodes. Therefore, the port assignment in Figure 10 assigns the net socket
and port nodes in the following fashion: B11S11 to B1S1, Request-Good to Request-Good
and B13S13 to B3S3.
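
The port assignment rule can be checked mechanically. The following Python sketch assumes plain dictionaries for the socket kinds, port tags and assignment; these data structures and the function `check_assignment` are hypothetical and only mirror the node names used in the text.

```python
# Socket nodes on the superpage, with their role relative to the substitution transition.
sockets = {"B11S11": "input", "Request-Good": "input", "B13S13": "output"}

# Port nodes on the subpage, with their port type tags.
ports = {"B1S1": "IN", "Request-Good": "IN", "B3S3": "OUT"}

# Port assignment from the hierarchy inscription: socket node -> port node.
assignment = {"B11S11": "B1S1", "Request-Good": "Request-Good", "B13S13": "B3S3"}

def check_assignment(sockets, ports, assignment):
    """Input sockets must map to IN-tagged ports, output sockets to OUT-tagged ports."""
    expected_tag = {"input": "IN", "output": "OUT"}
    return all(
        socket in assignment and ports.get(assignment[socket]) == expected_tag[kind]
        for socket, kind in sockets.items()
    )

consistent = check_assignment(sockets, ports, assignment)
swapped = check_assignment(sockets, ports, {**assignment, "B13S13": "B1S1"})
```

Relating an output socket to an IN-tagged port, as in the second call, violates the rule and is reported as inconsistent.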

Finally, the page hierarchy graph describes the decomposition hierarchy (nesting) of
the different protocols (pages). The CP-net pages, the Main Interaction Protocol and
the Commitment Interaction Protocol, correspond to the page hierarchy graph nodes
(Figure 10). The arc inscription indicates the substitution transition, i.e. Commitment.

6. Representing Temporal Aspects of Interactions 

Two temporal interaction aspects are specified by FIPA (?). In this section, we show how
timed CP-nets (see also Appendix A) can be applied for modelling agent interactions that
involve temporal aspects, such as interaction duration, deadlines for message exchange, etc.

A first aspect, duration, is the interaction activity time period. Two periods can be
distinguished: transmission time and response time. The transmission time indicates the
time interval during which a communicative act is sent by one agent and received by the
designated receiver agent. The response time period denotes the time interval in which
the corresponding receiver agent is performing some task as a response to the incoming
communicative act.

The second temporal aspect is deadlines. Deadlines denote the time limit by which
a communicative act must be sent. Otherwise, the corresponding communicative act is
considered to be invalid. These issues have not been addressed in previous investigations
related to agent interactions modelling using Petri nets (see footnote 6).

We propose to utilize timed CP-nets techniques to represent these temporal aspects of
agent interactions. In doing so, we assume a global clock (see footnote 7). We begin with
deadlines. Figure 11-a introduces the AUML representation of message deadlines. The deadline
keyword is a variation of the communicative act sequence expressions described in Section 4. It
sets a time constraint on the start of the transmission of the associated communicative act.
In Figure 11-a, agent1 must send the msg communicative act to agent2 before the defined
deadline. Once the deadline expires, the msg communicative act is considered to be invalid.

[Figure 11, panel (a): AUML representation]

Figure 11: Deadline sequence expression.

Figure 11-b shows a timed CP-net implementation of the deadline sequence expression.
The timed CP-net in Figure 11-b defines an additional MSG-TIME color set associated with
the net message places. The MSG-TIME color set extends the MSG color set, described in
Section 4, by adding a time stamp attribute to the message token. Thus, the communicative
act token is a record (s,r,t,c)@[Tts]. The @[..] expression denotes the corresponding token
time stamp, whereas the token time value is indicated starting with a capital 'T'.
Accordingly, the described message token has a ts time stamp. The communicative act time limit
is defined using the val deadline parameter. Therefore, the deadline sequence expression
semantics is simulated using the following transition guard: [Tts < Tdeadline]. This
transition guard, comparing the msg time stamp against the deadline parameter, guarantees
that an expired msg communicative act cannot be received.
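
A minimal sketch of the deadline guard, assuming a Python analogue of the time-stamped message token. The class `TimedMsg`, the function `deadline_guard` and the numeric time values are hypothetical choices for illustration; the comparison itself is the guard [Tts < Tdeadline] from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TimedMsg:
    """Analogue of a MSG-TIME token: the record (s, r, t, c) plus a time stamp ts."""
    s: str
    r: str
    t: str
    c: str
    ts: float

def deadline_guard(msg, deadline):
    """Transition guard [Tts < Tdeadline]: only unexpired messages can be received."""
    return msg.ts < deadline

deadline = 10.0   # assumed value of the 'val deadline' parameter
on_time = TimedMsg("agent1", "agent2", "inform", "content", ts=4.0)
late = TimedMsg("agent1", "agent2", "inform", "content", ts=12.0)

accepted = deadline_guard(on_time, deadline)
rejected = deadline_guard(late, deadline)
```

Because the comparison is strict, a message stamped exactly at the deadline is treated as expired in this sketch.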

We now turn to representing interaction duration. The AUML representation is shown in
Figure 12-a. The AUML time intensive message notation is used to denote the
communicative act transmission time. As a rule, communicative act arrows are illustrated horizontally.
This indicates that the message transmission time can be neglected. However, in case the
message transmission time is significant, the communicative act is drawn slanted downwards.
The vertical distance, between the arrowhead and the arrow tail, denotes the message
transmission time. Thus, the communicative act msg1, sent from agent1 to agent2, has a t1
transmission time.

6. ? (?, ?) mention deadlines without presenting any implementation details.

7. Implementing it, we can use the private clock of an overhearing agent as the global clock for our Petri
net representation. Thus, the time stamp of the message is the overhearer's time when the corresponding
message was overheard.

Figure 12: Interaction duration.



The response time in Figure 12-a is indicated through the interaction thread length.
The incoming msg1 communicative act causes agent2 to perform some task before sending
a response msg2 message. The corresponding interaction thread duration is denoted through
the t2 time period. Thus, this time period specifies the agent2 response time to the incoming
msg1 communicative act.

The CP-net implementation of the interaction duration time periods is shown in
Figure 12-b. The communicative act transmission time is illustrated using the timed CP-nets
@+ operator. The net transitions simulate the communicative act transmission between
agents. Therefore, representing a transmission time of t1, the CP-net transition adds a t1
time period to the incoming message token time stamp. Accordingly, the transition @+ Tt1
output arc expression denotes a t1 delay to the time stamp of the outgoing token. Thus,
the corresponding transition takes t1 time units and consequently so does the msg1
communicative act transmission time.

In contrast to communicative act transmission time, the agent interaction response time
is represented implicitly. Previously, we have defined a MSG-TIME color set that indicates
message token time stamps. Analogously, in Figure 12-b we introduce an additional
INTER-STATE-TIME color set. This color set is associated with the net agent places and it presents
the possibility to attach time stamps to agent tokens as well. Now, let us assume that A2B2
and msg2 places contain a single token each. The circled '1' next to the corresponding place,
together with the multi-set inscription, indicates the place current marking. Thus, the agent
and the message place tokens have a ts1 and a ts2 time stamp respectively. The ts1 time
stamp denotes the time by which agent2 has received the msg1 communicative act sent
by agent1. The ts2 time stamp indicates the time by which agent2 is ready to send the msg2
response message to agent1. Thus, the agent2 response time t2 (Figure 12-a) is ts2 - ts1.
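
The two duration computations amount to simple time stamp arithmetic. The sketch below assumes integer time units and uses hypothetical function names; `transmit` mirrors the @+ delay on the output arc, and `response_time` mirrors the difference ts2 - ts1.

```python
def transmit(ts, t1):
    """Output arc '@+ t1': the outgoing token carries the incoming time stamp plus t1."""
    return ts + t1

def response_time(ts1, ts2):
    """Response time t2 of the receiver: time stamp of msg2 minus time stamp of the agent token."""
    return ts2 - ts1

sent_at = 100                       # assumed time stamp of the msg1 token when sent
ts1 = transmit(sent_at, t1=5)       # agent2 has received msg1 at time 105
ts2 = 112                           # assumed time at which agent2 is ready to send msg2
t2 = response_time(ts1, ts2)        # 112 - 105 = 7 time units
```

Transmission time is thus written explicitly on an arc, while response time is only recovered by comparing the time stamps of two tokens.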

7. Algorithm and a Concluding Example

Our final contribution in this paper is a skeleton procedure for transforming an AUML
conversation protocol diagram of two interacting agents to its CP-net representation. The
procedure is semi-automated: it relies on a human to fill in some details, but also has
automated aspects. We apply this procedure to a complex multi-agent conversation protocol
that involves many of the interaction building blocks already discussed.

The procedure is shown in Algorithm 1. The algorithm input is an AUML protocol
diagram and the algorithm creates, as an output, a corresponding CP-net representation.
The CP-net is constructed in iterations using a queue. The algorithm essentially creates the
conversation net by exploring the interaction protocol breadth-first while avoiding cycles.



Algorithm 1 CreateConversationNet(input: AUML, output: CPN)

 1: S ← new queue
 2: CPN ← new CP-net
 3:
 4: A1B1 ← new agent place with color information
 5: S.enqueue(A1B1)
 6:
 7: while S not empty do
 8:     curr ← S.dequeue()
 9:
10:     MP ← CreateMessagePlaces(AUML, curr)
11:     RP ← CreateResultingAgentPlaces(AUML, curr, MP)
12:     (TR, AR) ← CreateTransitionsAndArcs(AUML, curr, MP, RP)
13:     FixColor(AUML, CPN, MP, RP, TR, AR)
14:
15:     for each place p in RP do
16:         if p was not created in current iteration then
17:             continue
18:         if p is not terminating place then
19:             S.enqueue(p)
20:
21:     CPN.places = CPN.places ∪ MP ∪ RP
22:     CPN.transitions = CPN.transitions ∪ TR
23:     CPN.arcs = CPN.arcs ∪ AR
24:
25: return CPN



Lines 1-2 create and initiate the algorithm queue, and the output CP-net, respectively.
The queue, denoted by S, holds the initiating agent places of the current iteration. These
places correspond to interaction states that initiate further conversation between the
interacting agents. In lines 4-5, an initial agent place A1B1 is created and inserted into the
queue. The A1B1 place represents a joint initial interaction state for the two agents. Lines
7-23 contain the main loop.

We enter the main loop in line 8 and set the curr variable to the first initiating agent
place in the S queue. Lines 10-13 create the CP-net components corresponding to the current
iteration as follows. First, in line 10, message places, associated with the curr agent place, are
created using the CreateMessagePlaces procedure (which we do not detail here). This
procedure extracts the communicative acts that are associated with a given interaction
state, from the AUML diagram. These places correspond to communicative acts, which
take agents from the joint interaction state curr to its successor(s). Then in line 11, the
CreateResultingAgentPlaces procedure creates agent places that correspond to interaction
state changes as a result of the communicative acts associated with the curr agent place (again
based on the AUML diagram). Then, in the CreateTransitionsAndArcs procedure (line 12),
these places are connected using the principles described in Sections 3-6. Thus, the CP-net
structure (net places, transitions and arcs) is created. Finally, in line 13, the FixColor
procedure adds token color elements to the CP-net structure, to support deadlines, cardinality,
and other communicative act attributes.

Lines 15-19 determine which resulting agent places are inserted into the S queue for
further iteration. Only non-terminating agent places, i.e., places that do not correspond to
interaction states that terminate the interaction, are inserted into the queue in lines 18-19.
However, there is one exception (lines 16-17): a resulting agent place that has already been
handled by the algorithm is not inserted back into the S queue, since re-inserting it could cause
an infinite loop. Then, to complete the current iteration, the output CP-net, denoted
by the CPN variable, is updated with the CP-net components of the current iteration in lines
21-23. The main loop iterates as long as the S queue is not empty. The resulting CP-net is
returned in line 25.
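
The control flow just described is essentially a worklist traversal over joint interaction states. The following Python sketch illustrates that loop only; it is not the authors' implementation. The `protocol` dictionary is a hypothetical stand-in for the AUML diagram, the three helper procedures are collapsed into a few lines, FixColor is omitted, and the assumed building block is the simple one in which an agent place and a message place feed a transition that outputs the resulting agent place.

```python
from collections import deque

def convert(protocol, initial, terminating):
    """Worklist sketch of the AUML-to-CP-net conversion loop.

    protocol: dict mapping an agent place to a list of
              (communicative_act, resulting_agent_place) pairs.
              This is an assumed encoding of the AUML diagram.
    """
    cpn = {"places": {initial}, "transitions": set(), "arcs": set()}
    queue = deque([initial])   # S: initiating agent places
    handled = {initial}        # places that were already queued once
    iterations = 0

    while queue:
        curr = queue.popleft()
        iterations += 1
        steps = protocol.get(curr, [])
        mp = {act for act, _ in steps}   # message places
        rp = {nxt for _, nxt in steps}   # resulting agent places
        tr, ar = set(), set()
        for act, nxt in steps:
            t = (curr, act, nxt)         # one transition per message
            tr.add(t)
            ar |= {(curr, t), (act, t), (t, nxt)}
        # Queue only unhandled, non-terminating resulting places.
        for place in sorted(rp):
            if place not in handled and place not in terminating:
                handled.add(place)
                queue.append(place)
        # Merge this iteration's components into the output net.
        cpn["places"] |= mp | rp
        cpn["transitions"] |= tr
        cpn["arcs"] |= ar
    return cpn, iterations

# Hypothetical two-step protocol, used only to exercise the loop.
toy = {
    "A1B1": [("request", "A2B2")],
    "A2B2": [("agree", "A3B3"), ("refuse", "A4B4")],
}
cpn, iterations = convert(toy, "A1B1", terminating={"A3B3", "A4B4"})
print(iterations, sorted(cpn["places"]))
```

The `handled` set plays the role of the exception in lines 16-17: a protocol that loops back to an earlier joint state is still converted in a finite number of iterations.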

To demonstrate this algorithm, we now apply it to the FIPA Contract Net Interaction
Protocol (?) (Figure [13]). This protocol allows interacting agents to negotiate. The
Initiator agent issues m calls for proposals using a cfp communicative act. Each of the m
Participants may refuse or counter-propose by a given deadline, sending either a refuse or
a propose message, respectively. A refuse message terminates the interaction. In contrast,
a propose message continues the corresponding interaction.

Once the deadline expires, the Initiator does not accept any further Participant response
messages. It evaluates the received Participant proposals and selects one, several,
or no agents to perform the requested task. Accepted proposals result in the sending of
accept-proposal messages, while the remaining proposals are rejected using reject-proposal
messages. A reject-proposal message terminates the interaction with the corresponding
Participant. On the other hand, an accept-proposal message commits a Participant to perform
the requested task. On successful completion, the Participant informs the Initiator by sending
either an inform-done or an inform-result communicative act. However, if a Participant has
failed to accomplish the task, it communicates a failure message.

We now use the algorithm introduced above to create a CP-net that represents the
FIPA Contract Net Interaction Protocol. The corresponding CP-net model is constructed in
four iterations of the algorithm. Figure [14] shows the CP-net representation after the second
iteration of the algorithm, while Figure [15] shows the CP-net representation after the fourth
and final iteration.

The Contract Net Interaction Protocol starts from the I1P1 place, which represents a joint
interaction state where Initiator is ready to send a cfp communicative act (I1) and Participant
is waiting for the corresponding cfp message (P1). The I1P1 place is created and inserted
into the queue before the iterations through the main loop begin.

Figure 13: FIPA Contract Net Interaction Protocol using AUML.

First iteration. The curr variable is set to the I1P1 place. The algorithm creates
the net places associated with the I1P1 place, i.e., a Cfp message place and an
I2P2 resulting agent place. The I2P2 place denotes an interaction state in which Initiator
has already sent a cfp communicative act to Participant and is now waiting for its response
(I2), and Participant has received the cfp message and is now deciding on an
appropriate response (P2). These are created using the CreateMessagePlaces and the
CreateResultingAgentPlaces procedures, respectively.

Then, the CreateTransitionsAndArcs procedure in line 12 connects the three places
using a simple asynchronous message building block, as shown in Figure [1]-b (Section [3]).
In line 13, as the color sets of the places are determined, the algorithm also handles the
cardinality of the cfp communicative act by putting an appropriate sequence expression on
the transition, using the principles presented in Figure [6]-b (Section [4]). Accordingly, the
color set associated with the I1P1 place is changed to the INTER-STATE-CARD color set.
Since the I2P2 place is not a terminating place, it is inserted into the S queue.

Second iteration. curr is set to the I2P2 place. The Participant agent can send, as a
response, either a refuse or a propose communicative act. The Refuse and Propose message
places are created by CreateMessagePlaces (line 10), and the resulting places I3P3 and I4P4,
corresponding to the results of the refuse and propose communicative acts, respectively,
are created by CreateResultingAgentPlaces (line 11). The I3P3 place represents a joint
interaction state where Participant has sent the refuse message and terminated (P3), while
Initiator has received it and terminated (I3). The I4P4 place represents the joint state in
which Participant has sent the propose message (P4), while Initiator has received the
message and is considering its response (I4).

Figure 14: FIPA Contract Net Interaction Protocol using CP-net after the 2nd iteration.

In line 12, the I2P2, Refuse, I3P3, Propose and I4P4 places are connected using the
XOR-decision building block presented in Figure [3]-b (Section [3]). Then, the FixColor
procedure (line 13) adds the appropriate token color attributes to allow a deadline sequence
expression (on both the refuse and the propose messages) to be implemented as shown in
Figure [11]-b (Section [6]). The I3P3 place denotes a terminating state, whereas the I4P4
place continues the interaction. Thus, in lines 18-19, only the I4P4 place is inserted into the
queue for the next iteration of the algorithm. The state of the net at the end of the second
iteration of the algorithm is presented in Figure [14].

Third iteration. curr is set to I4P4. Here, the Initiator response to a Participant
proposal can be either an accept-proposal or a reject-proposal. The CreateMessagePlaces
procedure in line 10 thus creates the corresponding Accept-Proposal and Reject-Proposal message
places. The reject-proposal and accept-proposal messages cause the interacting agents to
transition to the I5P5 and I6P6 places, respectively. These agent places are created using the
CreateResultingAgentPlaces procedure (line 11). The I5P5 place denotes an interaction
state in which Initiator has sent a reject-proposal message and terminated the interaction
(I5), while the Participant has received the message and terminated as well (P5). In
contrast, the I6P6 place represents an interaction state where Initiator has sent an
accept-proposal message and is waiting for a response (I6), while Participant has received the
accept-proposal communicative act and is now performing the requested task before sending
a response (P6). The Initiator agent sends exclusively either an accept-proposal or a
reject-proposal message. Thus, the I4P4, Reject-Proposal, I5P5, Accept-Proposal and I6P6 places
are connected using an XOR-decision block (in the CreateTransitionsAndArcs procedure,
line 12).

The FixColor procedure in line 13 now operates as follows. According to the interaction
protocol semantics, the Initiator agent evaluates all the received Participant proposals once
the deadline passes. Only then are the appropriate reject-proposal and accept-proposal
communicative acts sent. Thus, FixColor assigns a MSG-TIME color set to the Reject-Proposal
and the Accept-Proposal message places, and creates a [Tts >= Tdeadline] transition
guard on the associated transitions. This transition guard guarantees that Initiator
cannot send any response until the deadline expires and all valid Participant responses
have been received. The resulting I5P5 agent place denotes a terminating interaction state,
whereas the I6P6 agent place continues the interaction. Thus, only the I6P6 agent place is
inserted into the S queue.

Fourth iteration. curr is set to I6P6. This place is associated with three communicative
acts: inform-done, inform-result and failure. The inform-done and the inform-result
messages are instances of the inform communicative act class. Thus, CreateMessagePlaces
(line 10) creates only two message places, Inform and Failure. In line 11,
CreateResultingAgentPlaces creates the I7P7 and I8P8 agent places. The failure communicative
act causes the interacting agents to transition to the I7P7 agent place, while both inform
messages cause the agents to transition to the I8P8 agent place. The I7P7 place represents a
joint interaction state where Participant has sent the failure message and terminated (P7),
while Initiator has received a failure communicative act and terminated (I7). On the other
hand, the I8P8 place denotes an interaction state in which Participant has sent the inform
message (either inform-done or inform-result) and terminated (P8), while Initiator has
received an inform communicative act and terminated (I8). The inform and failure communicative
acts are sent exclusively. Thus, CreateTransitionsAndArcs (line 12) connects
the I6P6, Failure, I7P7, Inform and I8P8 places using an XOR-decision building block.
Then, FixColor assigns a [#t msg = inform-done or #t msg = inform-result] transition
guard on the transition associated with the Inform message place. Since both the I7P7 and
the I8P8 agent places represent terminating interaction states, they are not inserted into the
queue, which remains empty at the end of the current iteration. This signifies the end of the
conversion. The complete conversation CP-net resulting after this iteration of the algorithm
is shown in Figure [15].
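
To make the outcome of the four iterations concrete, the sketch below encodes the agent places and message places named in the walkthrough as a plain transition table and replays an overheard message sequence against it. This is an illustrative simplification, not the CP-net of Figure 15 itself: token colors, the cfp cardinality and the deadline guard are left out, and the `replay` function and the table layout are assumptions made for this example. Only the Inform guard (inform-done or inform-result) is modeled.

```python
# (agent place, message place) -> resulting agent place,
# read off the four iterations described in the text.
CONTRACT_NET = {
    ("I1P1", "cfp"): "I2P2",
    ("I2P2", "refuse"): "I3P3",
    ("I2P2", "propose"): "I4P4",
    ("I4P4", "reject-proposal"): "I5P5",
    ("I4P4", "accept-proposal"): "I6P6",
    ("I6P6", "failure"): "I7P7",
    ("I6P6", "Inform"): "I8P8",
}
TERMINATING = {"I3P3", "I5P5", "I7P7", "I8P8"}
INFORM_KINDS = {"inform-done", "inform-result"}  # guard on the Inform place

def replay(messages, start="I1P1"):
    """Follow a sequence of overheard messages; return the joint state reached."""
    place = start
    for msg in messages:
        # Both inform variants are routed to the single Inform message place.
        act = "Inform" if msg in INFORM_KINDS else msg
        key = (place, act)
        if key not in CONTRACT_NET:
            raise ValueError(f"{msg!r} is not legal in state {place}")
        place = CONTRACT_NET[key]
    return place

final = replay(["cfp", "propose", "accept-proposal", "inform-done"])
print(final, final in TERMINATING)
```

A message that is not legal in the current joint state raises an error, which is the kind of signal an overhearing monitor could use to flag a protocol violation.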

The procedure we outline can guide the conversion of many 2-agent conversation protocols
in AUML to their CP-net equivalents. However, it is not sufficiently developed to
address the general n-agent case. Appendix [C] presents a complex example of a 3-agent
conversation protocol, which was successfully converted manually, without the guidance of the
algorithm. This example incorporates many advanced features of our CP-net representation
technique and would have been beyond the scope of many previous investigations.

8. Summary and Conclusions 

Over recent years, open distributed MAS applications have gained broad acceptance both
in the multi-agent academic community and in real-world industry. As a result, increasing
attention has been directed to multi-agent conversation representation techniques. In
particular, Petri nets have recently been shown to provide a viable representation approach
(?, ?, ?, ?).

Figure 15: FIPA Contract Net Interaction Protocol using CP-net after the 4th (and final)
iteration.

However, radically different approaches have been proposed to using Petri nets for
modelling multi-agent conversations, and the relative strengths and weaknesses of the proposed
techniques have not been examined. Our work introduces a novel classification of previous
investigations and then compares these investigations with respect to their scalability and
appropriateness for overhearing tasks.

Based on the insights gained from the analysis, we have developed a novel representation
that uses CP-nets in which places explicitly represent joint interaction states and messages.
This representation technique offers significant improvements (compared to previous
approaches) in terms of scalability, and is particularly suitable for monitoring via overhearing.
We systematically show how this representation covers essentially all the features required
to model complex multi-agent conversations, as defined by the FIPA conversation standards
(?). These include simple and complex interaction building blocks (Section [3] and Appendix [B]),
communicative act attributes and multiple concurrent conversations using the same CP-net
(Section [4]), nested and interleaved interactions using hierarchical CP-nets (Section [5]), and
temporal interaction attributes using timed CP-nets (Section [6]). The developed techniques
have been demonstrated, throughout the paper, on complex interaction protocols defined in
the FIPA conversation standards (see in particular the example presented in Appendix [C]).
Previous approaches could handle some of these examples (though with reduced scalability),
but only a few were shown to cover all the required features.

Finally, the paper presented a skeleton procedure for semi-automatically converting
AUML protocol diagrams (the chosen FIPA representation standard) to an equivalent CP-net
representation. We have demonstrated its use on a challenging FIPA conversation protocol,
which was difficult to represent using previous approaches.

We believe that this work can assist and motivate continuing research on multi-agent 
conversations including such issues as performance analysis, validation and verification (?), 
agent conversation visualization, automated monitoring (?, ?, ?), deadlock detection (?), 
debugging (?) and dynamic interpretation of interaction protocols (?, ?). Naturally, some 
issues remain open for future work. For example, the presented procedure addresses only 
AUML protocol diagrams representing two agent roles. We plan to investigate an n-agent 
version in the future.

Appendix A. A Brief Introduction to Petri Nets

Petri nets (?) are a widespread, established methodology for representing and reasoning
about distributed systems, combining a graphical representation with a comprehensive
mathematical theory. One version of Petri nets is called Place/Transition nets (PT-nets) (?).
A PT-net is a bipartite directed graph where each node is either a place or a transition
(Figure [16]). The net places and transitions are drawn as circles and rectangles,
respectively. The PT-net arcs support only place → transition and transition → place
connections, never connections between two places or between two transitions. The arc
direction determines the input/output characteristics of the place and the transition
connected. Thus, given an arc P → T connecting place P and transition T, we say that
place P is an input place of transition T and, conversely, that transition T is an output
transition of place P. The P → T arc is considered to be an output arc of place P and an
input arc of transition T.

A PT-net place may be marked by small black dots called tokens. The arc expression is
an integer that determines the number of tokens associated with the corresponding arc.
By convention, an arc expression equal to 1 is omitted. A specific transition is enabled if
and only if the marking of its input places satisfies the appropriate arc expressions. For
example, consider arc P → T to be the only arc connecting place P and transition T. Then,
given that this arc has an arc expression of 2, transition T is enabled if and only
if place P is marked with at least two tokens. When the transition is enabled, it may fire
(occur). The transition occurrence removes tokens from the transition input places and puts
tokens into the transition output places, as specified by the arc expressions of the
corresponding input/output arcs. Figures [16]-a and [16]-b show a PT-net marking before
and after transition firing, respectively.

Figure 16: A PT-net example.
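
The enabling and firing rule can be stated compactly in code. The following minimal Python sketch assumes a marking stored as a `Counter` from place names to token counts, and arc expressions given as dictionaries from place names to integers; the function names and the P/Q example net are illustrative choices, not taken from Figure 16.

```python
from collections import Counter

def is_enabled(marking, inputs):
    """True iff every input place holds at least as many tokens as its arc demands."""
    return all(marking.get(place, 0) >= n for place, n in inputs.items())

def fire(marking, inputs, outputs):
    """Return the marking after one occurrence of the transition."""
    if not is_enabled(marking, inputs):
        raise ValueError("transition is not enabled")
    new = Counter(marking)
    new.subtract(inputs)   # remove tokens from the input places
    new.update(outputs)    # add tokens to the output places
    return +new            # drop places left with zero tokens

# Hypothetical net: P --2--> T --1--> Q, mirroring the arc expression 2 in the text.
before = Counter({"P": 2})
after = fire(before, inputs={"P": 2}, outputs={"Q": 1})
print(dict(before), dict(after))
```

With only one token in P the same transition is not enabled, so `fire` raises an error instead of producing a negative marking.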

Although computationally equivalent, a different version of Petri nets, called Colored 
Petri nets (CP-nets) (?, ?, ?), offers greater flexibility in compactly representing complex 
systems. Similarly to the PT-net model, CP-nets consist of net places, net transitions and 
arcs connecting them. However, in CP-nets, tokens are not just single bits, but can be 
complex, structured information carriers. The type of additional information carried by a
token is called the token color, and it may be simple (e.g., an integer or a string) or complex
(e.g., a record or a tuple). Each place is declared with a place color set, so that it only
holds tokens of particular colors. A CP-net place marking is a token multi-set (i.e., a set in which a
member may appear more than once) corresponding to the appropriate place color set. CP- 
net arcs pass token multi-sets between the places and transitions. CP-net arc expressions 
can evaluate token multi-sets and may involve complex calculation procedures over token 
variables declared to be associated with the corresponding arcs. 

The CP-net model introduces additional extensions to PT-nets. Transition guards are
boolean expressions that constrain transition firings. A transition guard associated with
a transition tests the tokens that pass through the transition, and enables the transition
to fire only if the guard is satisfied (i.e., the test evaluates to true). The CP-net
transition guards, together with place color sets and arc expressions, appear as part of
the net inscriptions in the CP-net.
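
As a rough illustration of colored tokens and guards, the sketch below stores a place marking as a multi-set (`Counter`) of token colors and moves one token through a transition only when a guard accepts it. The tuple layout `(sender, receiver, act)`, the function name and the guard are assumptions made for this example; real CP-net tools express the same idea with color set declarations and net inscriptions.

```python
from collections import Counter

def fire_colored(source, target, token, guard):
    """Move one token of the given color from source to target if the guard holds.

    source, target: Counter multi-sets of token colors (place markings).
    Returns the new (source, target) markings; the inputs are not modified.
    """
    if source[token] < 1:
        raise ValueError("no such token in the input place")
    if not guard(token):
        raise ValueError("guard evaluated to false")
    new_source, new_target = Counter(source), Counter(target)
    new_source[token] -= 1
    new_target[token] += 1
    return +new_source, new_target

# Hypothetical token colors: (sender, receiver, communicative act).
inform = ("agent1", "agent2", "inform")
refuse = ("agent1", "agent2", "refuse")
place = Counter({inform: 2, refuse: 1})   # the same color may occur twice

def is_inform(token):
    return token[2] == "inform"

place_after, out = fire_colored(place, Counter(), inform, is_inform)
print(place_after[inform], out[inform])
```

Attempting the same call with the `refuse` token fails the guard, so the transition stays disabled for that binding.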

In order to visualize and manage the complexity of large CP-nets, hierarchical CP-nets 
(?, ?) allow hierarchical representations of CP-nets, in which sub-CP nets can be re-used 
in higher-level CP nets, or abstracted away from them. Hierarchical CP-nets are built from 
pages, which are themselves CP-nets. Superpages present a higher level of hierarchy, and 
are CP-nets that refer to subpages, in addition to transitions and places. A subpage may
also function as a superpage to other subpages. This way, multiple hierarchy levels can be 
used in a hierarchical CP-net structure. 

The relationship between a superpage and a subpage is defined by a substitution transition,
which stands in for the corresponding subpage instance as a transition on the superpage.
The substitution transition hierarchy inscription supplies
the exact mapping of the superpage places connected to the substitution transition (called
socket nodes) to the subpage places (called port nodes). The port types determine the
characteristics of the socket node to port node mappings. A complete CP-net hierarchical
structure is presented using a page hierarchy graph, a directed graph where vertices
correspond to pages, and directed edges correspond to direct superpage-subpage relationships.

Timed CP-nets (?) extend CP-nets to support the representation of temporal aspects 
using a global clock. Timed CP-net tokens have an additional color attribute called time 
stamp, which refers to the earliest time at which the token may be used. Time stamps can 
be used by arc expressions and transition guards to enable a timed transition if and only if
it satisfies two conditions: (i) the transition is color enabled, i.e., it satisfies the constraints
defined by the arc expressions and transition guards; and (ii) the tokens are ready, i.e., the time
of the global clock is equal to or greater than the tokens' time stamps. Only then can the 
transition fire. 
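
The two conditions can be checked independently, as in the short sketch below. It assumes each token bound to the transition is given as a `(color, time_stamp)` pair and that color enabling is reduced to a single guard predicate; both are simplifications introduced for this example.

```python
def timed_enabled(tokens, guard, clock):
    """Check the two enabling conditions of a timed transition.

    tokens: list of (color, time_stamp) pairs bound to the transition.
    guard:  predicate over a token color (stands in for arc expressions and guards).
    clock:  current value of the global clock.
    """
    color_enabled = all(guard(color) for color, _ in tokens)
    ready = all(stamp <= clock for _, stamp in tokens)
    return color_enabled and ready

def earliest_firing_time(tokens):
    """Smallest clock value at which all bound tokens are ready."""
    return max(stamp for _, stamp in tokens)

# Hypothetical binding: two tokens that become usable at times 3 and 5.
bound = [("propose", 3), ("propose", 5)]
accepts_propose = lambda color: color == "propose"
print(timed_enabled(bound, accepts_propose, clock=4),
      timed_enabled(bound, accepts_propose, clock=5),
      earliest_firing_time(bound))
```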



Appendix B. Additional Examples of Conversation Representation 
Building Blocks 

This appendix presents some additional interaction building blocks, beyond those already
described in Section [3]. The first is the AND-parallel messages interaction (AUML
representation shown in Figure [17]-a). Here, the sender agent1 sends both the msg1 message to
agent2 and the msg2 message to agent3. However, the order of the two communicative acts
is unconstrained.

Figure 17: AND-parallel messages interaction.

The representation of AND-parallel in our CP-net representation is shown in Figure [17]-b.
The A1B1C1, A2B2, A2C2, msg1 and msg2 places are defined similarly to Figures [3]-b and
[4]-b in Section [3]. However, we also define two additional intermediate agent places,
A'1B2C1 and A''1B1C2. The A'1B2C1 place represents a joint interaction state where agent1
has sent the msg1 message to agent2 and is ready to send the msg2 communicative act to
agent3 (A'1), agent2 has received the msg1 message (B2) and agent3 is waiting to receive the
msg2 communicative act (C1). The A''1B1C2 place represents a joint interaction state in which
agent1 is ready to send the msg1 message to agent2 and has already sent the msg2
communicative act to agent3 (A''1), agent2 is waiting to receive the msg1 message (B1) and
agent3 has received the msg2 communicative act (C2). These places enable agent1 to send both
communicative acts concurrently. Four transitions connect the appropriate places.
The behavior of the transitions A'1B2C1 → A2B2 and A''1B1C2 → A2C2
is similar to that described above. The transitions A1B1C1 → A'1B2C1 and A1B1C1 → A''1B1C2
are triggered by receiving messages msg1 and msg2, respectively. However, these transitions
should not consume the message token, since it is used further for triggering the transitions
A'1B2C1 → A2B2 and A''1B1C2 → A2C2. This is achieved by adding the appropriate message
place as an output place of the corresponding transition.

The second AUML interaction building block, shown in Figure [18]-a, is the message
sequence interaction, which is similar to AND-parallel. However, the message sequence
interaction explicitly defines the order of the transmitted messages. Using the 1/msg1
and 2/msg2 notation, Figure [18]-a specifies that the msg1 message should be sent before
msg2.

Figure 18: Sequence messages interaction. (a) AUML representation; (b) CP-net representation.

Figure [18]-b shows the corresponding CP-net representation. The A1B1C1, A2B2, A2C2,
msg1 and msg2 places are defined as before. However, the CP-net implementation introduces
an additional intermediate agent place, A'1B2C1, which is identical to the corresponding
intermediate agent place in Figure [17]-b. A'1B2C1 is defined as an output place of the
A1B1C1 → A2B2 transition. It thus guarantees that the msg2 communicative act can be
sent (represented by the A'1B2C1 → A2C2 transition) only upon completion of the msg1
transmission (the A1B1C1 → A2B2 transition).

The last interaction we present is the synchronized messages interaction, shown in
Figure [19]-a. Here, agent3 simultaneously receives msg1 from agent1 and msg2 from agent2.
In AUML, this constraint is annotated by merging the two communicative act arrows into
a horizontal bar with a single output arrow.

Figure 19: Synchronized messages interaction.

Figure [19]-b illustrates the CP-net implementation of the synchronized messages interaction.
As in previous examples, we define the A1C1, B1C1, msg1, msg2 and A2B2C2 places. We
additionally define two intermediate agent places, A2C'1 and B2C''1. The A2C'1 place
represents a joint interaction state where agent1 has sent msg1 to agent3 (A2), and agent3 has
received it but is also waiting to receive msg2 (C'1). The B2C''1 place represents
a joint interaction state in which agent2 has sent msg2 to agent3 (B2), and agent3 has
received it but is also waiting to receive msg1 (C''1). These places guarantee
that the interaction does not transition to the A2B2C2 state until both msg1 and msg2 have
been received by agent3.
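
The synchronization guarantee can be checked on a small uncolored net built from the places named above. The arc structure in this sketch is an assumption (the figure itself is not reproduced here): each message moves its sender's pair into the corresponding intermediate place, and a third "join" transition needs both intermediate places to produce A2B2C2. Token colors are ignored.

```python
from collections import Counter

# transition name -> (input places, output places); assumed structure.
NET = {
    "recv_msg1": ({"A1C1", "msg1"}, {"A2C'1"}),
    "recv_msg2": ({"B1C1", "msg2"}, {"B2C''1"}),
    "join": ({"A2C'1", "B2C''1"}, {"A2B2C2"}),
}

def run(marking):
    """Fire enabled transitions until none is enabled; return the final marking."""
    marking = Counter(marking)
    fired = True
    while fired:
        fired = False
        for inputs, outputs in NET.values():
            if all(marking[place] >= 1 for place in inputs):
                marking.subtract(inputs)   # consume one token per input place
                marking.update(outputs)    # produce one token per output place
                fired = True
    return +marking

only_msg1 = run({"A1C1": 1, "B1C1": 1, "msg1": 1})
both = run({"A1C1": 1, "B1C1": 1, "msg1": 1, "msg2": 1})
print(dict(only_msg1), dict(both))
```

With only msg1 overheard the net stops in the intermediate place, and the final joint state is reached only once both messages have arrived, regardless of their order.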



Appendix C. An Example of a Complex Interaction Protocol 

We present an example of a complex 3-agent conversation protocol, which was manually
converted to a CP-net representation using the building blocks in this paper. The
conversation protocol addressed here is the FIPA Brokering Interaction Protocol (?). This
interaction protocol incorporates many advanced conversation features of our representation,
such as nesting, communicative act sequence expressions and message guards. Its
AUML representation is shown in Figure [20].

The Initiator agent begins the interaction by sending a proxy message to the Broker
agent. The proxy communicative act contains the requested proxied-communicative-act as
part of its argument list. The Broker agent processes the request and responds with either an
agree or a refuse message. Communication of a refuse message terminates the interaction.
If the Broker agent has agreed to function as a proxy, it then locates the agents matching
the Initiator request. If no such agent can be found, the Broker agent communicates
a failure-no-match message and the interaction terminates. Otherwise, the Broker agent
begins m interactions with the matching agents. For each such agent, the Broker informs the
Initiator, sending either an inform-done-proxy or a failure-proxy communicative act. The
failure-proxy communicative act terminates the sub-protocol interaction with the matching
agent in question. The inform-done-proxy message continues the interaction. As the
sub-protocol progresses, the Broker forwards the received responses to the Initiator agent using
the reply-message-sub-protocol communicative acts. However, there can be other failures
that are not explicitly returned from the sub-protocol interaction (e.g., if the agent executing
the sub-protocol has failed). If the Broker agent detects such a failure, it communicates
a failure-brokering message, which terminates the sub-protocol interaction.

Figure 20: FIPA Brokering Interaction Protocol - AUML representation.

A CP-net representation of the FIPA Brokering Interaction Protocol is shown in
Figure [21]. The Brokering Interaction Protocol starts from the I1B1 place. The I1B1 place
represents a joint interaction state where Initiator is ready to send a proxy communicative
act (I1) and Broker is waiting to receive it (B1). The proxy communicative act causes the
interacting agents to transition to I2B2. This place denotes an interaction state in which
Initiator has already sent a proxy message to Broker (I2) and Broker has received it (B2).
The Broker agent can send, as a response, either a refuse or an agree communicative act.
This CP-net component is implemented using the XOR-decision building block presented
in Section [3]. The refuse message causes the agents to transition to the I3B3 place and thus
terminate the interaction. This place corresponds to Broker sending a refuse message
and terminating (B3), while Initiator receives the message and terminates (I3). On the
other hand, the agree communicative act causes the agents to transition to the I4B4 place,
which represents a joint interaction state in which the Broker has sent an agree message
to Initiator (and is now trying to locate the receivers of the proxied message), while the
Initiator has received the agree message.

The Broker agent's search for suitable receivers may result in two alternatives. First,
if no matching agents are found, the interaction terminates in the I5B5 agent place.
This joint interaction place corresponds to an interaction state where Broker has sent the
failure-no-match communicative act (B5), and Initiator has received the message and
terminated (I5). The second alternative is that suitable agents have been found. Then, Broker
starts sending proxied-communicative-act messages to the agents on the established list
of designated receivers, i.e., TARGET-LIST. The first such proxied-communicative-act
message causes the interacting agents to transition to the I4B6P1 place. The I4B6P1 place denotes
a joint interaction state of three agents: Initiator, Broker and Participant (the receiver).
The Initiator individual state remains unchanged (I4), since the proxied-communicative-act
message starts an interaction between Broker and Participant. The Broker individual
state (B6) denotes that designated agents have been found and the proxied-communicative-act
messages are ready to be sent, while Participant is waiting to receive the interaction
initiating communicative act (P1). The proxied-communicative-act message place is also
connected as an output place of the transition. This message place is used as part of a
CP-net XOR-decision structure, which enables the Broker agent to send either a
failure-no-match or a proxied-communicative-act, respectively. Thus, the token denoting the
proxied-communicative-act message must not be consumed by the transition.

In this way, multiple proxied-communicative-act messages are sent to all Participants. This
is implemented similarly to the broadcast sequence expression implementation (Section [4]).
Furthermore, the proxied-communicative-act type is verified against the type of the requested
proxied communicative act, which is obtained from the original proxy message content.
We use the Proxied-Communicative-Act-Type message type place to implement this CP-net
component, similarly to Figure [8]. Each proxied-communicative-act message causes the
interacting agents to transition to both the I4B7P1 and the B6P1 places.

The B6P1 place corresponds to the interaction between the Broker and the Participant
agents. It represents a joint interaction state in which Broker is ready to send a
proxied-communicative-act message to Participant (B6), and Participant is waiting for the message
(P1). In fact, the B6P1 place initiates the nested interaction protocol that results in the B10P3
place. The B10P3 place represents a joint interaction state where Participant has sent the
reply-message communicative act and terminated (P3), and Broker has received the message
(B10). In our example, we have chosen the FIPA Query Interaction Protocol (?) (Figures [7]-[8])
as the interaction sub-protocol. The CP-net component implementing the nested interaction
sub-protocol is modeled using the principles described in Section [5]. Consequently, the
interaction sub-protocol is concealed using the Query-Sub-Protocol substitution transition.
The B6P1, proxied-communicative-act and B10P3 places determine the substitution transition
socket nodes. These socket nodes are assigned to the CP-net port nodes in Figure [8] as
follows. The B6P1 and proxied-communicative-act places are assigned to the I1P1 and query
input port nodes, while the B10P3 place is assigned to the I3P3, I5P5 and I6P6 output port
nodes.

Figure 21: FIPA Brokering Interaction Protocol - CP-net representation.

We now turn to the I4B7P1 place. In contrast to the B6P1 place, this place corresponds to
the main interaction protocol. The I4B7P1 place represents a joint interaction state in which
Initiator is waiting for Broker to respond (I4), Broker is ready to send an appropriate
response communicative act (B7), and, to the best of the Initiator's knowledge, the interaction
with Participant has not yet begun (P1). The Broker agent can send one of two messages,
either a failure-proxy or an inform-done-proxy, depending on whether it has succeeded in
sending the proxied-communicative-act message to Participant. The failure-proxy message
causes the agents to terminate the interaction with the corresponding Participant agent and to
transition to the I6B8P1 place. This place denotes a joint interaction state in which Initiator
has received a failure-proxy communicative act and terminated (I6), Broker has sent the
failure-proxy message and terminated as well (B8), and the interaction with the Participant
agent has never started (P1). On the other hand, the inform-done-proxy causes the agents to
transition to the I7B9P2 place. The I7B9P2 place represents an interaction state where Broker
has sent the inform-done-proxy message (B9), Initiator has received it (I7), and Participant
has begun the interaction with the Broker agent (P2). Again, this is represented using the
XOR-decision building block.

Finally, the Broker agent can either send a reply-message-sub-protocol or a
failure-brokering communicative act. The failure-brokering message causes the interacting
agents to transition to the I8B11P2 place. This place indicates that Broker has sent a
failure-brokering message and terminated (B11), Initiator has received the message and
terminated (I8), and Participant has terminated during the interaction with the Broker agent
(P2). The reply-message-sub-protocol communicative act causes the agents to transition to the
I9B12P3 place. The I9B12P3 place indicates that Broker has sent a reply-message-sub-protocol
message and terminated (B12), Initiator has received the message and terminated (I9), and
Participant has successfully completed the nested sub-protocol with the Broker agent and
terminated as well (P3). Thus, the B10P3 place, denoting a successful completion of the
nested sub-protocol, is also the corresponding transition input place.
Abstract

Logical hidden Markov models (LOHMMs) upgrade traditional hidden Markov models 
to deal with sequences of structured symbols in the form of logical atoms, rather than flat 
characters. 

This note formally introduces LOHMMs and presents solutions to the three central in- 
ference problems for LOHMMs: evaluation, most likely hidden state sequence and param- 
eter estimation. The resulting representation and algorithms are experimentally evaluated 
on problems from the domain of bioinformatics. 

1. Introduction 

Hidden Markov models (HMMs) (Rabiner & Juang, 1986) are extremely popular for an- 
alyzing sequential data. Application areas include computational biology, user modelling, 
speech recognition, empirical natural language processing, and robotics. Despite their suc- 
cesses, HMMs have a major weakness: they handle only sequences of flat, i.e., unstruc- 
tured symbols. Yet, in many applications the symbols occurring in sequences are struc- 
tured. Consider, e.g., sequences of UNIX commands, which may have parameters such 
as emacs lohmms.tex, ls, latex lohmms.tex, . . . Thus, commands are essentially structured.
Tasks that have been considered for UNIX command sequences include the prediction of 
the next command in the sequence (Davison & Hirsh, 1998), the classification of a command 
sequence in a user category (Korvemaker & Greiner, 2000; Jacobs & Blockeel, 2001), and 
anomaly detection (Lane, 1999). Traditional HMMs cannot easily deal with this type of 
structured sequences. Indeed, applying HMMs requires either 1) ignoring the structure of 
the commands (i.e., the parameters), or 2) taking all possible parameters explicitly into 
account. The former approach results in a serious information loss; the latter leads to a 
combinatorial explosion in the number of symbols and parameters of the HMM and as a 
consequence inhibits generalization. 

The above sketched problem with HMMs is akin to the problem of dealing with structured
examples in traditional machine learning algorithms as studied in the fields of inductive
logic programming (Muggleton & De Raedt, 1994) and multi-relational learning (Džeroski &
Lavrač, 2001). In this paper, we propose an (inductive) logic programming framework,
Logical HMMs (LOHMMs), that upgrades HMMs to deal with structure. The key idea underlying
LOHMMs is to employ logical atoms as structured (output and state) symbols. Using logical
atoms, the above UNIX command sequence can be represented as emacs(lohmms.tex), ls,
latex(lohmms.tex), . . . There are two important motivations for using logical atoms at the
symbol level. First, variables in the atoms allow one to abstract from specific symbols.
E.g., the logical atom emacs(X, tex) represents all files X that a LaTeX user tex could edit
using emacs. Second, unification allows one to share information among states. E.g., the
sequence emacs(X, tex), latex(X, tex) denotes that the same file is used as an argument for
both Emacs and LaTeX.

The paper is organized as follows. After reviewing the logical preliminaries, we introduce 
LOHMMs and define their semantics in Section 3; in Section 4, we upgrade the basic 
HMM inference algorithms for use in LOHMMs; we investigate the benefits of LOHMMs in 
Section 5: we show that LOHMMs are strictly more expressive than HMMs, that they can 
be — by design — an order of magnitude smaller than their corresponding propositional 
instantiations, and that unification can yield models, which better fit the data. In Section 6, 
we empirically investigate the benefits of LOHMMs on real world data. Before concluding, 
we discuss related work in Section 7. Proofs of all theorems can be found in the Appendix. 

2. Logical Preliminaries 

A first-order alphabet $\Sigma$ is a set of relation symbols r with arity $m \geq 0$, written
r/m, and a set of functor symbols f with arity $n \geq 0$, written f/n. If n = 0 then f is
called a constant, if m = 0 then r is called a propositional variable. (We assume that at
least one constant is given.) An atom $r(t_1, \ldots, t_n)$ is a relation symbol r followed by
a bracketed n-tuple of terms $t_i$. A term t is a variable V or a functor symbol
$f(t_1, \ldots, t_k)$ immediately followed by a bracketed k-tuple of terms $t_i$. Variables will
be written in upper-case, and constant, functor and predicate symbols lower-case. The symbol _
will denote anonymous variables which are read and treated as distinct, new variables each
time they are encountered. An iterative clause is a formula of the form $H \leftarrow B$ where H
(called head) and B (called body) are logical atoms. A substitution
$\theta = \{V_1/t_1, \ldots, V_n/t_n\}$, e.g. {X/tex}, is an assignment of terms $t_i$ to
variables $V_i$. Applying a substitution $\sigma$ to a term, atom or clause e yields the
instantiated term, atom, or clause $e\sigma$ where all occurrences of the variables $V_i$ are
simultaneously replaced by the term $t_i$, e.g. ls(X) $\leftarrow$ emacs(F, X){X/tex} yields
ls(tex) $\leftarrow$ emacs(F, tex). A substitution $\sigma$ is called a unifier for a finite set S
of atoms if $S\sigma$ is singleton. A unifier $\theta$ for S is called a most general unifier
(MGU) for S if, for each unifier $\sigma$ of S, there exists a substitution $\gamma$ such that
$\sigma = \theta\gamma$. A term, atom or clause E is called ground when it contains no variables,
i.e., vars(E) = $\emptyset$. The Herbrand base of $\Sigma$, denoted as $hb_\Sigma$, is the set of
all ground atoms constructed with the predicate and functor symbols in $\Sigma$. The set
$G_\Sigma(A)$ of an atom A consists of all ground atoms $A\theta$ that belong to $hb_\Sigma$.
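To make substitutions and most general unifiers concrete, the following Python sketch unifies two atoms. It is an illustration under stated assumptions, not part of the original formalism: atoms and compound terms are encoded as tuples `(symbol, arg1, ...)`, variables as strings starting with an upper-case letter, and constants as lower-case strings. The helper names (`is_var`, `walk`, `apply_subst`, `mgu`) are hypothetical, and both the occurs check and anonymous variables are omitted for brevity.

```python
def is_var(t):
    """Assumed encoding: variables are strings starting with an upper-case letter."""
    return isinstance(t, str) and t[:1].isupper()


def walk(t, subst):
    """Follow variable bindings until a non-variable or an unbound variable is reached."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t


def apply_subst(t, subst):
    """Apply a substitution to a term or atom, replacing all bound variables."""
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply_subst(a, subst) for a in t[1:])
    return t


def mgu(a, b, subst=None):
    """Return a most general unifier of a and b as a dict, or None if none exists."""
    subst = dict(subst or {})
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        subst[a] = b
        return subst
    if is_var(b):
        subst[b] = a
        return subst
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            subst = mgu(x, y, subst)
            if subst is None:
                return None
        return subst
    return None  # clash between different constants or symbols


# emacs(F, tex) and emacs(hmm1, U) unify with {F/hmm1, U/tex}.
theta = mgu(("emacs", "F", "tex"), ("emacs", "hmm1", "U"))
```

Applying `apply_subst(("emacs", "F", "X"), {"X": "tex"})` reproduces the instantiation step of the example above, leaving the unbound variable F untouched.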

3. Logical Hidden Markov Models 

The logical component of a traditional HMM corresponds to a Mealy machine (Hopcroft
& Ullman, 1979), i.e., a finite state machine where the output symbols are associated with
transitions. This is essentially a propositional representation because the symbols used to
represent states and output symbols are flat, i.e. not structured. The key idea underlying
LOHMMs is to replace these flat symbols by abstract symbols. An abstract symbol A is, by
definition, a logical atom. It is abstract in that it represents the set of all ground, i.e.,
variable-free atoms of A over the alphabet $\Sigma$, denoted by $G_\Sigma(A)$. Ground atoms then
play the role of the traditional symbols used in HMMs.

Example 1 Consider the alphabet $\Sigma_1$ which has as constant symbols tex, dvi, hmm1,
and lohmm1, and as relation symbols emacs/2, ls/1, xdvi/1, latex/2. Then the atom
emacs(File, tex) represents the set {emacs(hmm1, tex), emacs(lohmm1, tex)}. We assume
that the alphabet is typed to avoid useless instantiations such as emacs(tex, tex).

The use of atoms instead of flat symbols allows us to analyze logical and structured sequences
such as emacs(hmm1, tex), latex(hmm1, tex), xdvi(hmm1, dvi).
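The set of ground atoms represented by an abstract atom, as in Example 1, can be enumerated mechanically once the argument types are fixed. The Python sketch below does this; the `DOMAINS` table (which constants may fill which argument position) is an assumption made for illustration, since the text only states that the alphabet is typed, and the function names are hypothetical.

```python
from itertools import product

# Assumed typing: allowed constants per argument position of emacs/2.
DOMAINS = {
    ("emacs", 2): [["hmm1", "lohmm1"], ["tex"]],
}


def is_var(t):
    """Assumed encoding: variables are strings starting with an upper-case letter."""
    return isinstance(t, str) and t[:1].isupper()


def ground_instances(atom):
    """Enumerate all well-typed ground atoms represented by an abstract atom."""
    pred, args = atom[0], atom[1:]
    domains = DOMAINS[(pred, len(args))]
    variables = []
    for a in args:
        if is_var(a) and a not in variables:
            variables.append(a)
    # A variable ranges over the domain of the first position it occurs in.
    var_domains = [domains[args.index(v)] for v in variables]
    result = []
    for values in product(*var_domains):
        binding = dict(zip(variables, values))
        ground = (pred,) + tuple(binding.get(a, a) for a in args)
        # Discard ill-typed atoms such as emacs(tex, tex).
        if all(g in d for g, d in zip(ground[1:], domains)):
            result.append(ground)
    return result


instances = ground_instances(("emacs", "File", "tex"))
```

Under these assumed domains, `instances` contains exactly the two ground atoms listed in Example 1, and the ill-typed atom emacs(tex, tex) has no ground instances.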

Definition 1 Abstract transitions are expressions of the form $p : H \xleftarrow{O} B$ where
$p \in [0, 1]$, and H, B and O are atoms. All variables are implicitly assumed to be universally
quantified, i.e., the scope of variables is a single abstract transition.

The atoms H and B represent abstract states and O represents an abstract output symbol.
The semantics of an abstract transition $p : H \xleftarrow{O} B$ is that if one is in one of the
states in $G_\Sigma(B)$, say $B\theta_B$, one will go with probability p to one of the states in
$G_\Sigma(H\theta_B)$, say $H\theta_B\theta_H$, while emitting a symbol in
$G_\Sigma(O\theta_B\theta_H)$, say $O\theta_B\theta_H\theta_O$.

Example 2 Consider c $\equiv$ 0.8 : xdvi(File, dvi) $\xleftarrow{\text{latex(File)}}$ latex(File, tex).
In general H, B and O do not have to share the same predicate. This is only due to the
nature of our running example. Assume now that we are in state latex(hmm1, tex), i.e.
$\theta_B$ = {File/hmm1}. Then c specifies that there is a probability of 0.8 that the next state
will be in $G_{\Sigma_1}$(xdvi(hmm1, dvi)) = {xdvi(hmm1, dvi)} (i.e., the probability is 0.8 that
the next state will be xdvi(hmm1, dvi)), and that one of the symbols in
$G_{\Sigma_1}$(latex(hmm1)) = {latex(hmm1)} (i.e., latex(hmm1)) will be emitted. Abstract states
might also be more complex such as latex(file(FileStem, FileExtension), User).

The above example was simple because $\theta_H$ and $\theta_O$ were both empty. The situation
becomes more complicated when these substitutions are not empty. Then, the resulting
state and output symbol sets are not necessarily singletons. Indeed, for the transition
0.8 : emacs(File', tex) $\xleftarrow{\text{latex(File)}}$ latex(File, tex) the resulting state set would be
$G_{\Sigma_1}$(emacs(File', tex)) = {emacs(hmm1, tex), emacs(lohmm1, tex)}. Thus the transition
is non-deterministic because there are two possible resulting states. We therefore need a
mechanism to assign probabilities to these possible alternatives.

Definition 2 The selection distribution $\mu$ specifies for each abstract state and observation
symbol A over the alphabet $\Sigma$ a distribution $\mu(\cdot \mid A)$ over $G_\Sigma(A)$.

To continue our example, let $\mu$(emacs(hmm1, tex) | emacs(File', tex)) = 0.4 and
$\mu$(emacs(lohmm1, tex) | emacs(File', tex)) = 0.6. Then there would be a probability
of 0.4 × 0.8 = 0.32 that the next state is emacs(hmm1, tex) and of 0.6 × 0.8 = 0.48 that it is
emacs(lohmm1, tex).
Taking $\mu$ into account, the meaning of an abstract transition $p : H \xleftarrow{O} B$ can be
summarized as follows. Let $B\theta_B \in G_\Sigma(B)$, $H\theta_B\theta_H \in G_\Sigma(H\theta_B)$ and
$O\theta_B\theta_H\theta_O \in G_\Sigma(O\theta_B\theta_H)$. Then the model makes a transition from
state $B\theta_B$ to $H\theta_B\theta_H$ and emits symbol $O\theta_B\theta_H\theta_O$ with probability

$$p \cdot \mu(H\theta_B\theta_H \mid H\theta_B) \cdot \mu(O\theta_B\theta_H\theta_O \mid O\theta_B\theta_H). \tag{1}$$

To represent $\mu$, any probabilistic representation can, in principle, be used, e.g. a Bayesian
network or a Markov chain. Throughout the remainder of the present paper, however,
we will use a naive Bayes approach. More precisely, we associate to each argument of a
relation r/m a finite domain $D_i^{r/m}$ of constants and a probability distribution $P_i^{r/m}$
over $D_i^{r/m}$. Let vars(A) = $\{V_1, \ldots, V_l\}$ be the variables occurring in an atom A over
r/m, and let $\sigma = \{V_1/s_1, \ldots, V_l/s_l\}$ be a substitution grounding A. Each $V_j$ is then
considered a random variable over the domain $D_{\arg(V_j)}^{r/m}$ of the argument $\arg(V_j)$ it
appears first in. Then, $\mu(A\sigma \mid A) = \prod_{j=1}^{l} P_{\arg(V_j)}^{r/m}(s_j)$. E.g.
$\mu$(emacs(hmm1, tex) | emacs(F, E)) is computed as the product of
$P_1^{\text{emacs}/2}$(hmm1) and $P_2^{\text{emacs}/2}$(tex).
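As a rough illustration of the naive Bayes selection distribution and of Equation (1), the Python sketch below computes the selection probability of a ground atom given an abstract atom as a product of per-argument probabilities. The table `ARG_DIST` and the function name `selection_prob` are assumptions for illustration; the probability values are chosen to reproduce the running example (0.4 and 0.6 for the two file names).

```python
import math

# Assumed per-argument distributions, indexed by (predicate, arity, position).
ARG_DIST = {
    ("emacs", 2, 0): {"hmm1": 0.4, "lohmm1": 0.6},
    ("emacs", 2, 1): {"tex": 1.0},
}


def is_var(t):
    """Assumed encoding: variables are strings starting with an upper-case letter."""
    return isinstance(t, str) and t[:1].isupper()


def selection_prob(ground, abstract):
    """Probability of selecting `ground` from `abstract` under naive Bayes."""
    if ground[0] != abstract[0] or len(ground) != len(abstract):
        return 0.0
    pred, args = abstract[0], abstract[1:]
    prob, seen = 1.0, {}
    for pos, (a, g) in enumerate(zip(args, ground[1:])):
        if is_var(a):
            if a in seen:
                # A repeated variable must keep its value; it adds no new factor.
                if seen[a] != g:
                    return 0.0
            else:
                seen[a] = g
                prob *= ARG_DIST[(pred, len(args), pos)].get(g, 0.0)
        elif a != g:
            return 0.0  # constant mismatch: not an instance of the abstract atom
    return prob


p = 0.8  # probability of the abstract transition in the running example
head = ("emacs", "File1", "tex")
next_hmm1 = p * selection_prob(("emacs", "hmm1", "tex"), head)
next_lohmm1 = p * selection_prob(("emacs", "lohmm1", "tex"), head)
```

With these assumed values, `next_hmm1` and `next_lohmm1` come out at approximately 0.32 and 0.48, the two successor-state probabilities of the running example; the output-symbol factor of Equation (1) would be multiplied in the same way.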

Thus far the semantics of a single abstract transition has been defined. A LOHMM 
usually consists of multiple abstract transitions and this creates a further complication. 

Example 3 Consider 0.8 : latex(File, tex) $\xleftarrow{\text{emacs(File)}}$ emacs(File, tex) and
0.4 : xdvi(File) $\xleftarrow{\text{emacs(File)}}$ emacs(File, User). These two abstract transitions make
conflicting statements about the state resulting from emacs(hmm1, tex). Indeed, according
to the first transition, the probability is 0.8 that the resulting state is latex(hmm1, tex),
whereas the second one assigns 0.4 to xdvi(hmm1).

There are essentially two ways to deal with this situation. On the one hand, one might want
to combine and normalize the two transitions and assign a probability of 2/3 and 1/3,
respectively. On the other hand, one might want to have only one rule firing. In this paper,
we chose the latter option because it allows us to consider transitions more independently,
it simplifies learning, and it yields locally interpretable models. We employ the subsumption
(or generality) relation among the B-parts of the two abstract transitions. Indeed, the B-part
of the first transition $B_1$ = emacs(File, tex) is more specific than that of the second
transition $B_2$ = emacs(File, User) because there exists a substitution $\theta$ = {User/tex}
such that $B_2\theta = B_1$, i.e., $B_2$ subsumes $B_1$. Therefore
$G_{\Sigma_1}(B_1) \subseteq G_{\Sigma_1}(B_2)$ and the first transition can be regarded as more
informative than the second one. It should therefore be preferred over the second one when
starting from emacs(hmm1, tex). We will also say that the first transition is more specific
than the second one. Note that this generality relation imposes a partial order on the set of
all transitions. These considerations lead to the strategy of only considering the maximally
specific transitions that apply to a state in order to determine the successor states. This
implements a kind of exception handling or default reasoning and is akin to Katz's (1987)
back-off n-gram models. In back-off n-gram models, the most detailed model that is deemed to
provide sufficiently reliable information about the current context is used. That is, if one
encounters an n-gram that is not sufficiently reliable, then one backs off to an
(n − 1)-gram; if that is not reliable either, one backs off to level n − 2, etc.

Figure 1: A logical hidden Markov model.

The conflict resolution strategy will work properly provided that the bodies of all
maximally specific transitions (matching a given state) represent the same abstract state. This
can be enforced by requiring the generality relation over the B-parts to be closed under the
greatest lower bound (glb) for each predicate, i.e., for each pair $B_1, B_2$ of bodies, such that
$\theta = \mathrm{mgu}(B_1, B_2)$ exists, there is another body B (called lower bound) which subsumes
$B_1\theta$ (therefore also $B_2\theta$) and is subsumed by $B_1, B_2$, and if there is any other lower
bound then it is subsumed by B. E.g., if the body of the second abstract transition in our
example is emacs(hmm1, User) then the set of abstract transitions would not be closed under glb.
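The conflict-resolution strategy of keeping only maximally specific transitions can be sketched in Python as follows. This is an illustration under stated assumptions: atoms are function-free tuples, variables are upper-case strings, and the dictionary layout of a transition as well as the names `match`, `subsumes` and `maximally_specific` are hypothetical. The two transitions are those of Example 3.

```python
def is_var(t):
    """Assumed encoding: variables are strings starting with an upper-case letter."""
    return isinstance(t, str) and t[:1].isupper()


def match(general, specific):
    """One-way matching: return theta with general*theta == specific, or None.

    Variables occurring in `specific` are treated like constants.
    """
    if general[0] != specific[0] or len(general) != len(specific):
        return None
    theta = {}
    for g, s in zip(general[1:], specific[1:]):
        if is_var(g):
            if theta.setdefault(g, s) != s:
                return None  # the same variable would need two different values
        elif g != s:
            return None
    return theta


def subsumes(general, specific):
    """True if `general` is at least as general as `specific`."""
    return match(general, specific) is not None


def maximally_specific(transitions, state):
    """Transitions whose body matches `state` and is not strictly more general
    than the body of another matching transition."""
    applicable = [t for t in transitions if subsumes(t["body"], state)]
    return [
        t for t in applicable
        if not any(subsumes(t["body"], u["body"]) and not subsumes(u["body"], t["body"])
                   for u in applicable)
    ]


TRANSITIONS = [
    {"p": 0.8, "head": ("latex", "File", "tex"), "out": ("emacs", "File"),
     "body": ("emacs", "File", "tex")},
    {"p": 0.4, "head": ("xdvi", "File"), "out": ("emacs", "File"),
     "body": ("emacs", "File", "User")},
]

chosen = maximally_specific(TRANSITIONS, ("emacs", "hmm1", "tex"))
```

For the state emacs(hmm1, tex) only the first, more specific transition survives, whereas for a state such as emacs(hmm1, other) the second transition is the only applicable one and is therefore used as a default.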

Finally, in order to specify a prior distribution over states, we assume a finite set $\Upsilon$ of
clauses of the form $p : H \leftarrow \text{start}$ using a distinguished start symbol such that p is the
probability of the LOHMM to start in a state of $G_\Sigma(H)$.

By now we are able to formally define logical hidden Markov models. 

Definition 3 A logical hidden Markov model (LOHMM) is a tuple $(\Sigma, \mu, \Delta, \Upsilon)$ where
$\Sigma$ is a logical alphabet, $\mu$ a selection probability over $\Sigma$, $\Delta$ is a set of abstract
transitions, and $\Upsilon$ is a set of abstract transitions encoding a prior distribution. Let
$\mathbf{B}$ be the set of all atoms that occur as body parts of transitions in $\Delta$. We assume
$\mathbf{B}$ to be closed under glb and require

$$\forall B \in \mathbf{B} : \sum_{p : H \xleftarrow{O} B \,\in\, \Delta} p = 1.0 \tag{2}$$

and that the probabilities p of clauses in $\Upsilon$ sum up to 1.0.

HMMs are a special case of LOHMMs in which $\Sigma$ contains only relation symbols of arity
zero and the selection probability is irrelevant. Thus, LOHMMs directly generalize HMMs.

LOHMMs can also be represented graphically. Figure 1 contains an example. The
underlying language $\Sigma_2$ consists of $\Sigma_1$ together with the constant symbol other which
denotes a user that does not employ LaTeX. In this graphical notation, nodes represent abstract
states and black tipped arrows denote abstract transitions. White tipped arrows are used to
represent meta knowledge. More precisely, white tipped, dashed arrows represent the generality
or subsumption ordering between abstract states. If we follow a transition to an abstract state
with an outgoing white tipped, dotted arrow then this dotted arrow will always be followed.
Dotted arrows are needed because the same abstract state can occur under different
circumstances. Consider the transition
p : latex(File', User') $\xleftarrow{\text{latex(File)}}$ latex(File, User).
Even though the atoms in the head and body of the transition are syntactically different they
represent the same abstract state. To accurately represent the meaning of this transition we
cannot use a black tipped arrow from latex(File, User) to itself, because this would actually
represent the abstract transition
p : latex(File, User) $\xleftarrow{\text{latex(File)}}$ latex(File, User).

Figure 2: Generating the observation sequence emacs(hmm1), latex(hmm1),
emacs(lohmm1), ls by the LOHMM in Figure 1. The command emacs is
abbreviated by em, f1 denotes the filename hmm1, f2 represents lohmm1, t denotes
a tex user, and o some other user. White tipped solid arrows indicate selections.

Furthermore, the graphical representation clarifies that LOHMMs are generative models.
Let us explain how the model in Figure 1 would generate the observation sequence
emacs(hmm1), latex(hmm1), emacs(lohmm1), ls (cf. Figure 2). It chooses an initial
abstract state, say emacs(F, U). Since both variables F and U are uninstantiated, the model
samples the state emacs(hmm1, tex) from $G_{\Sigma_2}$ using $\mu$. As indicated by the dashed
arrow, emacs(F, tex) is more specific than emacs(F, U). Moreover, emacs(hmm1, tex) matches
emacs(F, tex). Thus, the model enters emacs(F, tex). Since the value of F was already
instantiated in the previous abstract state, emacs(hmm1, tex) is sampled with probability
1.0. Now, the model goes over to latex(F, tex), emitting emacs(hmm1) because the abstract
observation emacs(F) is already fully instantiated. Again, since F was already instantiated,
latex(hmm1, tex) is sampled with probability 1.0. Next, we move on to emacs(F', U), emitting
latex(hmm1). Variables F' and U in emacs(F', U) were not yet bound; so, values, say
lohmm1 and other, are sampled from $\mu$. The dotted arrow brings us back to emacs(F, U).
Because variables are implicitly universally quantified in abstract transitions, the scope of
variables is restricted to single abstract transitions. In turn, F is treated as a distinct,
new variable, and is automatically unified with F', which is bound to lohmm1. In contrast,
variable U is already instantiated. Emitting emacs(lohmm1), the model makes a transition
to ls(U'). Assume that it samples tex for U'. Then, it remains in ls(U') with probability
0.4. Considering all possible samples allows one to prove the following theorem.

Theorem 1 (Semantics) A logical hidden Markov model over a language $\Sigma$ defines a
discrete time stochastic process, i.e., a sequence of random variables
$\langle X_t \rangle_{t=1,2,\ldots}$, where the domain of $X_t$ is $hb(\Sigma) \times hb(\Sigma)$. The induced
probability measure over the Cartesian product $\bigotimes_t hb(\Sigma) \times hb(\Sigma)$ exists and is
unique for each t > 0 and in the limit $t \to \infty$.

Before concluding this section, let us address some design choices underlying LOHMMs. 

First, LOHMMs have been introduced as Mealy machines, i.e., output symbols are
associated with transitions. Mealy machines fit our logical setting quite intuitively as they
directly encode the conditional probability P(O, S' | S) of making a transition from S to S'
emitting an observation O. Logical hidden Markov models define this distribution as

$$P(O, S' \mid S) = \sum_{p : H \xleftarrow{O'} B} p \cdot \mu(S' \mid H\sigma_B) \cdot \mu(O \mid O'\sigma_B\sigma_H)$$

where the sum runs over all abstract transitions $H \xleftarrow{O'} B$ such that B is most specific
for S. Observations correspond to (partially) observed proof steps and, hence, provide
information shared among heads and bodies of abstract transitions. In contrast, HMMs are
usually introduced as Moore machines. Here, output symbols are associated with states,
implicitly assuming O and S' to be independent. Thus, P(O, S' | S) factorizes into
P(O | S) · P(S' | S). This makes it more difficult to observe information shared among heads
and bodies. In turn, Moore-LOHMMs are less intuitive and harder to understand. For a more
detailed discussion of the issue, we refer to Appendix B where we essentially show that, as in
the propositional case, Mealy- and Moore-LOHMMs are equivalent.

Second, the naive Bayes approach for the selection distribution reduces the model
complexity at the expense of a lower expressivity: functors are neglected and variables are
treated independently. Adapting more expressive approaches is an interesting future line of
research. For instance, Bayesian networks allow one to represent factorial HMMs
(Ghahramani & Jordan, 1997). Factorial HMMs can be viewed as LOHMMs, where the hidden
states are summarized by a $2 \cdot k$-ary abstract state. The first k arguments encode the k
state variables, and the last k arguments serve as a memory of the previous joint state. $\mu$
of the i-th argument is conditioned on the (i + k)-th argument. Markov chains allow one to
sample compound terms of finite depth such as s(s(s(0))) and to model e.g. misspelled
filenames. This is akin to generalized HMMs (Kulp, Haussler, Reese, & Eeckman, 1996), in
which each node may output a finite sequence of symbols rather than a single symbol.

Finally, LOHMMs, as introduced in the present paper, specify a probability distribution
over all sequences of a given length. Reconsider the LOHMM in Figure 1. Already the
probabilities of all observation sequences of length 1, i.e., ls, emacs(hmm1), and
emacs(lohmm1), sum up to 1. More precisely, for each t > 0 it holds that
$\sum_{x_1, \ldots, x_t} P(X_1 = x_1, \ldots, X_t = x_t) = 1.0$. In order to model a distribution
over sequences of variable length, i.e.,
$\sum_{t > 0} \sum_{x_1, \ldots, x_t} P(X_1 = x_1, \ldots, X_t = x_t) = 1.0$, we may add a
distinguished end state. The end state is absorbing in that whenever the model makes a
transition into this state, it terminates the observation sequence generated.

4. Three Inference Problems for LOHMMs 

As for HMMs, three inference problems are of interest. Let M be a LOHMM and let
$O = O_1, O_2, \ldots, O_T$, T > 0, be a finite sequence of ground observations:

(1) Evaluation: Determine the probability P(O | M) that sequence O was generated by
    the model M.

(2) Most likely state sequence: Determine the hidden state sequence S* that has most
    likely produced the observation sequence O, i.e. $S^* = \arg\max_S P(S \mid O, M)$.

(3) Parameter estimation: Given a set $\mathbf{O} = \{O_1, \ldots, O_k\}$ of observation sequences,
    determine the most likely parameters $\lambda^*$ for the abstract transitions and the selection
    distribution of M, i.e. $\lambda^* = \arg\max_\lambda P(\mathbf{O} \mid \lambda)$.

Figure 3: Trellis induced by the LOHMM in Figure 1. The sets of reachable states at time
0, 1, . . . are denoted by $\mathcal{S}_0, \mathcal{S}_1, \ldots$ In contrast with HMMs, there is an
additional layer where the states are sampled from abstract states.



We will now address each of these problems in turn by upgrading the existing solutions for 
HMMs. This will be realized by computing a grounded trellis as in Figure 3. The possible 
ground successor states of any given state are computed by first selecting the applicable 
abstract transitions and then applying the selection probabilities (while taking into account 
the substitutions) to ground the resulting states. This two-step factorization is coalesced 
into one step for HMMs. 

To evaluate O, consider the probability of the partial observation sequence
$O_1, O_2, \ldots, O_t$ and (ground) state S at time t, $0 < t \leq T$, given the model
$M = (\Sigma, \mu, \Delta, \Upsilon)$

$$\alpha_t(S) := P(O_1, O_2, \ldots, O_t, q_t = S \mid M)$$

where $q_t = S$ denotes that the system is in state S at time t. As for HMMs, $\alpha_t(S)$ can be
computed using a dynamic programming approach. For t = 0, we set
$\alpha_0(S) = P(q_0 = S \mid M)$, i.e., $\alpha_0(S)$ is the probability of starting in state S and,
for t > 0, we compute $\alpha_t(S)$ based on $\alpha_{t-1}(S')$:

 1: $\mathcal{S}_0$ := {start}    /* initialize the set of reachable states */
 2: for t = 1, 2, ..., T do
 3:     $\mathcal{S}_t$ := $\emptyset$    /* initialize the set of reachable states at clock t */
 4:     foreach $S \in \mathcal{S}_{t-1}$ do
 5:         foreach maximally specific $p : H \xleftarrow{O} B \in \Delta \cup \Upsilon$ s.t. $\sigma_B = \mathrm{mgu}(S, B)$ exists do
 6:             foreach $S' = H\sigma_B\sigma_H \in G_\Sigma(H\sigma_B)$ s.t. $O_{t-1}$ unifies with $O\sigma_B\sigma_H$ do
 7:                 if $S' \notin \mathcal{S}_t$ then
 8:                     $\mathcal{S}_t$ := $\mathcal{S}_t \cup \{S'\}$
 9:                     $\alpha_t(S')$ := 0.0
10:                 $\alpha_t(S')$ := $\alpha_t(S') + \alpha_{t-1}(S) \cdot p \cdot \mu(S' \mid H\sigma_B) \cdot \mu(O_{t-1} \mid O\sigma_B\sigma_H)$
11: return $P(O \mid M) = \sum_{S \in \mathcal{S}_T} \alpha_T(S)$

where we assume for the sake of simplicity O = start for each abstract transition
$p : H \leftarrow \text{start} \in \Upsilon$. Furthermore, the boxed parts (the unification conditions and
the $\mu$ terms) specify all the differences to the HMM formula: unification and $\mu$ are taken
into account.

Clearly, as for HMMs, $P(O \mid M) = \sum_{S \in \mathcal{S}_T} \alpha_T(S)$ holds. The computational
complexity of this forward procedure is
$O(T \cdot s \cdot (|\mathbf{B}| + o \cdot g)) = O(T \cdot s^2)$ where
$s = \max_{t=1,2,\ldots,T} |\mathcal{S}_t|$, o is the maximal number of outgoing abstract transitions
with regard to an abstract state, and g is the maximal number of ground instances of an
abstract state. In a completely analogous manner, one can devise a backward procedure to
compute

$$\beta_t(S) = P(O_{t+1}, O_{t+2}, \ldots, O_T \mid q_t = S, M).$$

This will be useful for solving Problem (3).
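The forward recursion can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the unification and selection steps are hidden behind a hypothetical `successors(state)` function, assumed to return for every ground successor state and ground output the full ground transition probability (the abstract transition probability times the two selection factors). For brevity the sketch starts from an explicit prior over ground states instead of a start symbol and consumes one observation per step; the toy model `TOY` is invented.

```python
from collections import defaultdict


def forward(prior, successors, observations):
    """Return the probability of the observation sequence for a Mealy-style model.

    prior:        dict mapping ground state -> probability of starting there
    successors:   function state -> iterable of (next_state, output, prob)
    observations: list of ground output symbols
    """
    alpha = dict(prior)  # alpha values over the reachable states at time 0
    for obs in observations:
        nxt = defaultdict(float)  # alpha values over the states reachable next
        for state, a in alpha.items():
            for next_state, output, prob in successors(state):
                if output == obs:  # only transitions emitting obs contribute
                    nxt[next_state] += a * prob
        alpha = dict(nxt)
    return sum(alpha.values())


# Invented toy model: outgoing probabilities of each state sum to 1.
TOY = {
    "a": [("a", "x", 0.5), ("b", "y", 0.5)],
    "b": [("b", "x", 1.0)],
}

likelihood = forward({"a": 1.0}, lambda s: TOY[s], ["x", "y", "x"])
```

Only states that are actually reachable are stored at each step, which mirrors the role of the sets of reachable states in the trellis of Figure 3.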

Having a forward procedure, it is straightforward to adapt the Viterbi algorithm as a
solution to Problem (2), i.e., for computing the most likely state sequence. Let $\delta_t(S)$
denote the highest probability along a single path at time t which accounts for the first t
observations and ends in state S, i.e.,

$$\delta_t(S) = \max_{S_0, S_1, \ldots, S_{t-1}} P(S_0, S_1, \ldots, S_{t-1}, S_t = S, O_1, \ldots, O_{t-1} \mid M).$$

The procedure for finding the most likely state sequence basically follows the forward
procedure. Instead of summing over all ground transition probabilities in line 10, we maximize
over them. More precisely, we proceed as follows:

 1: $\mathcal{S}_0$ := {start}    /* initialize the set of reachable states */
 2: for t = 1, 2, ..., T do
 3:     $\mathcal{S}_t$ := $\emptyset$    /* initialize the set of reachable states at clock t */
 4:     foreach $S \in \mathcal{S}_{t-1}$ do
 5:         foreach maximally specific $p : H \xleftarrow{O} B \in \Delta \cup \Upsilon$ s.t. $\sigma_B = \mathrm{mgu}(S, B)$ exists do
 6:             foreach $S' = H\sigma_B\sigma_H \in G_\Sigma(H\sigma_B)$ s.t. $O_{t-1}$ unifies with $O\sigma_B\sigma_H$ do
 7:                 if $S' \notin \mathcal{S}_t$ then
 8:                     $\mathcal{S}_t$ := $\mathcal{S}_t \cup \{S'\}$
 9:                     $\delta_t(S, S')$ := 0.0
10:                 $\delta_t(S, S')$ := $\delta_t(S, S') + \delta_{t-1}(S) \cdot p \cdot \mu(S' \mid H\sigma_B) \cdot \mu(O_{t-1} \mid O\sigma_B\sigma_H)$
11:     foreach $S' \in \mathcal{S}_t$ do
12:         $\delta_t(S') = \max_{S \in \mathcal{S}_{t-1}} \delta_t(S, S')$
13:         $\psi_t(S') = \arg\max_{S \in \mathcal{S}_{t-1}} \delta_t(S, S')$



Here, $\delta_t(S, S')$ stores the probability of making a transition from S to S' and
$\psi_t(S')$ (with $\psi_1(S)$ = start for all states S) keeps track of the state maximizing the
probability along a single path at time t which accounts for the first t observations and ends
in state S'. The most likely hidden state sequence $S^*$ can now be computed as

$$S^*_{T+1} = \arg\max_{S \in \mathcal{S}_{T+1}} \delta_{T+1}(S) \quad \text{and} \quad S^*_t = \psi_{t+1}(S^*_{t+1}) \quad \text{for } t = T, T-1, \ldots, 1.$$
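A Viterbi-style search over ground states can be sketched in Python in the same spirit. As before, this is an illustration under assumptions rather than the authors' code: `successors(state)` is a hypothetical function returning full ground transition probabilities, the prior is given explicitly, and the toy model `TOY` is invented. Instead of back-pointers, the sketch stores the best path to each state directly, which is simpler but uses more memory.

```python
def viterbi(prior, successors, observations):
    """Return (probability, state sequence) of the most likely path.

    prior:        dict mapping ground state -> probability of starting there
    successors:   function state -> iterable of (next_state, output, prob)
    observations: list of ground output symbols
    """
    # best[state] = (probability of the best path ending in state, that path)
    best = {s: (p, [s]) for s, p in prior.items()}
    for obs in observations:
        nxt = {}
        for state, (prob, path) in best.items():
            for next_state, output, tprob in successors(state):
                if output != obs:
                    continue
                cand = prob * tprob
                # Maximize over predecessors instead of summing over them.
                if next_state not in nxt or cand > nxt[next_state][0]:
                    nxt[next_state] = (cand, path + [next_state])
        best = nxt
    if not best:
        return 0.0, []
    return max(best.values(), key=lambda pair: pair[0])


# Invented toy model: outgoing probabilities of each state sum to 1.
TOY = {
    "a": [("a", "x", 0.3), ("b", "x", 0.2), ("b", "y", 0.5)],
    "b": [("a", "x", 0.6), ("b", "y", 0.4)],
}

best_prob, best_path = viterbi({"a": 1.0}, lambda s: TOY[s], ["x", "x"])
```

For the observation sequence x, x the path a, b, a (probability about 0.12) beats a, a, a (about 0.09), although the first step from a to b is locally less likely.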



One can also consider problem (2) on a more abstract level. Instead of considering all
contributions of different abstract transitions T to a single ground transition from state S
to state S' in line 10, one might also consider the most likely abstract transition only. This
is realized by replacing line 10 in the forward procedure with

$$\alpha_t(S') := \max\bigl(\alpha_t(S'),\; \alpha_{t-1}(S) \cdot p \cdot \mu(S' \mid H\sigma_B) \cdot \mu(O_{t-1} \mid O\sigma_B\sigma_H)\bigr).$$

This solves the problem of finding the (2') most likely state and abstract transition
sequence:

Determine the sequence of states and abstract transitions
$GT^* = S_0, T_0, S_1, T_1, S_2, \ldots, S_T, T_T, S_{T+1}$ where there exist substitutions
$\theta_i$ with $S_{i+1} \leftarrow S_i \equiv T_i\theta_i$ that has most likely produced the
observation sequence O, i.e. $GT^* = \arg\max_{GT} P(GT \mid O, M)$.

Thus, logical hidden Markov models also pose new types of inference problems.

For parameter estimation, we have to estimate the maximum likelihood transition
probabilities and selection distributions. To estimate the former, we upgrade the well-known
Baum-Welch algorithm (Baum, 1972) for estimating the maximum likelihood parameters
of HMMs and probabilistic context-free grammars.

For HMMs, the Baum-Welch algorithm computes the improved estimate $\bar{p}$ of the
transition probability of some (ground) transition $T \equiv p : H \xleftarrow{O} B$ by taking the ratio

$$\bar{p} = \frac{\xi(T)}{\sum_{T' \equiv\, p' : H' \xleftarrow{O'} B \,\in\, \Delta \cup \Upsilon} \xi(T')} \tag{3}$$

between the expected number $\xi(T)$ of times of making the transition T at any time given
the model M and an observation sequence O, and the total number of times a transition
is made from B at any time given M and O.

Basically the same applies when T is an abstract transition. However, we have to be
a little bit more careful because we have no direct access to $\xi(T)$. Let
$\xi_t(\mathrm{gcl}, T)$ be the probability of following the abstract transition T via its ground
instance $\mathrm{gcl} \equiv p : GH \xleftarrow{GO} GB$ at time t, i.e.,

$$\xi_t(\mathrm{gcl}, T) = \frac{\alpha_t(GB) \cdot p \cdot \beta_{t+1}(GH)}{P(O \mid M)} \cdot \mu(GH \mid H\sigma_B) \cdot \mu(O_{t-1} \mid O\sigma_B\sigma_H), \tag{4}$$

where $\sigma_B$, $\sigma_H$ are as in the forward procedure (see above) and P(O | M) is the
probability that the model generated the sequence O. Again, the boxed terms (the two $\mu$
factors) constitute the main difference to the corresponding HMM formula. In order to apply
Equation (3) to compute improved estimates of probabilities associated with abstract
transitions, we set

$$\xi(T) = \sum_{t=1}^{T} \xi_t(T) = \sum_{t=1}^{T} \sum_{\mathrm{gcl}} \xi_t(\mathrm{gcl}, T)$$

where the inner sum runs over all ground instances of T.

This leads to the following re-estimation method, where we assume that the sets
$\mathcal{S}_i$ of reachable states are reused from the computations of the $\alpha$- and
$\beta$-values:

1: /* initialization of expected counts */
2: foreach $T \in \Delta \cup \Upsilon$ do
3:     $\xi(T)$ := m    /* or 0 if not using pseudocounts */
4: /* compute expected counts */
5: for t = 0, 1, ..., T do
6:     foreach $S \in \mathcal{S}_t$ do
7:         foreach max. specific $T \equiv p : H \xleftarrow{O} B \in \Delta \cup \Upsilon$ s.t. $\sigma_B = \mathrm{mgu}(S, B)$ exists do
8:             foreach $S' = H\sigma_B\sigma_H \in G_\Sigma(H\sigma_B)$ s.t. $S' \in \mathcal{S}_{t+1} \wedge \mathrm{mgu}(O_t, O\sigma_B\sigma_H)$ exists do
9:                 $\xi(T)$ := $\xi(T) + \alpha_t(S) \cdot p \cdot \beta_{t+1}(S') / P(O \mid M) \cdot \mu(S' \mid H\sigma_B) \cdot \mu(O_{t-1} \mid O\sigma_B\sigma_H)$



Here, equation (4) can be found in line 9. In line 3, we set pseudocounts as small sample- 
size regularizers. Other methods to avoid a biased underestimate of probabilities and even 
zero probabilities such as m-estimates (see e.g., Mitchell, 1997) can be easily adapted. 
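Once the expected counts are available, the ratio of Equation (3) amounts to normalizing them over all transitions that share the same body. The Python sketch below shows this final step only; the function name `reestimate`, the dictionary layout and the example counts are assumptions for illustration, and the pseudocount plays the role of the initial value m in line 3.

```python
from collections import defaultdict


def reestimate(expected_counts, bodies, pseudocount=0.0):
    """Normalize expected counts per body to obtain new transition probabilities.

    expected_counts: dict transition id -> expected number of uses
    bodies:          dict transition id -> body of that transition
    pseudocount:     value added to every count as a small-sample regularizer
    """
    totals = defaultdict(float)
    for t, xi in expected_counts.items():
        totals[bodies[t]] += xi + pseudocount
    new_probs = {}
    for t, xi in expected_counts.items():
        total = totals[bodies[t]]
        # A body that was never used keeps probability 0.0 in this sketch.
        new_probs[t] = (xi + pseudocount) / total if total > 0 else 0.0
    return new_probs


# Invented expected counts for three transitions, two of which share a body.
COUNTS = {"t1": 3.0, "t2": 1.0, "t3": 2.0}
BODIES = {"t1": "emacs(F,tex)", "t2": "emacs(F,tex)", "t3": "ls(U)"}

probs = reestimate(COUNTS, BODIES)
```

By construction the re-estimated probabilities of all transitions with the same body sum to 1, as required by Equation (2).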

To estimate the selection probabilities, recall that $\mu$ follows a naive Bayes scheme.
Therefore, the estimated probability for a domain element $d \in D$ for some domain D is the
ratio between the number of times d is selected and the number of times any $d' \in D$ is
selected. The procedure for computing the $\xi$-values can thus be reused.

Altogether, the Baum-Welch algorithm works as follows: While not converged, (1)
estimate the abstract transition probabilities, and (2) the selection probabilities. Since it is
an instance of the EM algorithm, it never decreases the likelihood of the data with an update,
and according to McLachlan and Krishnan (1997), it is guaranteed to reach a stationary
point. All standard techniques to overcome limitations of EM algorithms are applicable.
The computational complexity (per iteration) is
$O(k \cdot (\alpha + d)) = O(k \cdot T \cdot s^2 + k \cdot d)$ where k is the number of sequences,
$\alpha$ is the complexity of computing the $\alpha$-values (see above), and d is the sum over the
sizes of domains associated to predicates. Recently, Kersting and Raiko (2005) combined the
Baum-Welch algorithm with structure search for model selection of logical hidden Markov
models using inductive logic programming (Muggleton & De Raedt, 1994) refinement
operators. The refinement operators account for different abstraction levels which have to be
explored.

5. Advantages of LOHMMs 

In this section, we will investigate the benefits of LOHMMs: (1) LOHMMs are strictly 
more expressive than HMMs, and (2), using abstraction, logical variables and unification 
can be beneficial. More specifically, with (2), we will show that 

(B1) LOHMMs can be, by design, smaller than their propositional instantiations, and
(B2) unification can yield better log-likelihood estimates. 

5.1 On the Expressivity of LOHMMs 

Whereas HMMs specify probability distributions over regular languages, LOHMMs specify 
probability distributions over more expressive languages. 
Theorem 2 For any (consistent) probabilistic context-free grammar (PCFG) $G$ for some
language $\mathcal{L}$ there exists a LOHMM $M$ s.t. $P_G(w) = P_M(w)$ for all $w \in \mathcal{L}$.

The proof (see Appendix C) makes use of abstract states of unbounded 'depth'. More 
precisely, functors are used to implement a stack. Without functors, LOHMMs cannot 
encode PCFGs and, because the Herbrand base is finite, it can be proven that there always 
exists an equivalent HMM. 

Furthermore, if functors are allowed, LOHMMs are strictly more expressive than PCFGs. 
They can specify probability distributions over some languages that are context-sensitive: 



1.0 : stack(s(0), s(0)) $\leftarrow$ start
0.8 : stack(s(X), s(X)) $\xleftarrow{a}$ stack(X, X)
0.2 : unstack(s(X), s(X)) $\xleftarrow{a}$ stack(X, X)
1.0 : unstack(X, Y) $\xleftarrow{b}$ unstack(s(X), Y)
1.0 : unstack(s(0), Y) $\xleftarrow{c}$ unstack(s(0), s(Y))
1.0 : end $\xleftarrow{end}$ unstack(s(0), s(0))



The LOHMM defines a distribution over $\{a^n b^n c^n \mid n > 0\}$.
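To see why the stack/unstack transitions generate exactly strings of the form $a^n b^n c^n$, the following sketch simulates them. It is an illustration under stated assumptions: the Peano terms s(s(...(0))) are modelled as plain integers, the most specific matching body is selected by the order of the branches, the final `end` symbol is not appended to the string, and the function name `sample_string` is hypothetical.

```python
import random

def sample_string(rng, p_stack=0.8):
    """Sample one output string; the Peano terms s^k(0) are modelled as ints k."""
    out = []
    kind, x, y = "stack", 1, 1              # 1.0 : stack(s(0), s(0)) <- start
    while True:
        if kind == "stack":                 # body stack(X, X), output a
            out.append("a")
            kind = "stack" if rng.random() < p_stack else "unstack"
            x, y = x + 1, y + 1
        elif x == 1 and y == 1:             # body unstack(s(0), s(0)): go to end
            return "".join(out)
        elif x == 1:                        # body unstack(s(0), s(Y)), output c
            out.append("c")
            y -= 1
        else:                               # body unstack(s(X), Y), output b
            out.append("b")
            x -= 1

rng = random.Random(0)
samples = [sample_string(rng) for _ in range(5)]
```

The first argument counts the b symbols still to be emitted and the second the c symbols, which is how the two copies of the counter built up in the stack phase enforce equal numbers of a, b and c.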

Finally, the use of logical variables also enables one to deal with identifiers. Identifiers
are special types of constants that denote objects. Indeed, recall the UNIX command
sequence emacs lohmms.tex, ls, latex lohmms.tex, ... from the introduction. The filename
lohmms.tex is an identifier. Usually, the specific identifiers do not matter but rather the
fact that the same object occurs multiple times in the sequence. LOHMMs can easily deal
with identifiers by setting the selection probability $\mu$ to a constant for the arguments in
which identifiers can occur. Unification then takes care of the necessary variable bindings.

5.2 Benefits of Abstraction through Variables and Unification 

Reconsider the domain of UNIX command sequences. Unix users often reuse a newly created
directory in subsequent commands such as in mkdir(vt100x), cd(vt100x), ls(vt100x).
Unification should allow us to elegantly employ this information because it allows us to
specify that, after observing the created directory, the model makes a transition into a state
where the newly created directory is used:

$p_1$ : cd(Dir, mkdir) $\leftarrow$ mkdir(Dir, com) and $p_2$ : cd(_, mkdir) $\leftarrow$ mkdir(Dir, com)

If the first transition is followed, the cd command will move to the newly created directory; 
if the second transition is followed, it is not specified which directory cd will move to. Thus, 
the LOHMM captures the reuse of created directories as an argument of future commands. 
Moreover, the LOHMM encodes the simplest possible case to show the benefits of unifica- 
tion. At any time, the observation sequence uniquely determines the state sequence, and 
functors are not used. Therefore, we left out the abstract output symbols associated with 
abstract transitions. In total, the LOHMM U, modelling the reuse of directories, consists 
of 542 parameters only but still covers more than 451000 (ground) states, see Appendix D 
for the complete model. The compression in the number of parameters supports (B1).
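The role of unification in the first transition can be made concrete with a small sketch. It assumes a simplified term encoding that is not part of the original system: compound terms are tuples `(functor, arg1, ...)`, constants are lowercase strings, and variables are strings starting with an uppercase letter. The helper names `unify` and `apply_subst` are hypothetical.

```python
def is_var(t):
    """Variables are strings that start with an uppercase letter."""
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    t = walk(t, subst)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, subst) for a in t[1:])

def unify(a, b):
    """Return a most general unifier of a and b as a dict, or None."""
    subst, stack = {}, [(a, b)]
    while stack:
        s, t = stack.pop()
        s, t = walk(s, subst), walk(t, subst)
        if s == t:
            continue
        if is_var(s) or is_var(t):
            var, term = (s, t) if is_var(s) else (t, s)
            if occurs(var, term, subst):
                return None
            subst[var] = term
        elif (isinstance(s, tuple) and isinstance(t, tuple)
              and s[0] == t[0] and len(s) == len(t)):
            stack.extend(zip(s[1:], t[1:]))
        else:
            return None
    return subst

def apply_subst(t, subst):
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply_subst(a, subst) for a in t[1:])
    return t

# cd(Dir, mkdir) <- mkdir(Dir, com), applied to the ground state mkdir(vt100x, com)
state = ("mkdir", "vt100x", "com")
sigma = unify(("mkdir", "Dir", "com"), state)
next_state = apply_subst(("cd", "Dir", "mkdir"), sigma)
```

Because Dir is shared between body and head, the binding obtained from the observed mkdir atom carries over to the cd state, which is the information that the variant without shared variables cannot express.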

To empirically investigate the benefits of unification, we compare U with the variant N
of U where no variables are shared, i.e., no unification is used such that for instance the
first transition above is not allowed, see Appendix D. N has 164 parameters fewer than U.
We computed the following zero-one win function

$$f(O) = \begin{cases} 1 & \text{if } [\log P_U(O) - \log P_N(O)] > 0 \\ 0 & \text{otherwise} \end{cases}$$

leave-one-out cross-validated on Unix shell logs collected by Greenberg (1988). Overall,
the data consists of 168 users of four groups: computer scientists, nonprogrammers, novices 
and others. About 300000 commands have been logged with an average of 110 sessions 
per user. We present here results for a subset of the data. We considered all computer 
scientist sessions in which at least a single mkdir command appears. These yield 283 logical 
sequences over in total 3286 ground atoms. The LOO win was 81.63%. Other LOO statistics 
are also in favor of U: 





            training                             test
     log P(O)     log P_U(O)/P_N(O)     log P(O)     log P_U(O)/P_N(O)
U    -11361.0     1795.3                -42.8        7.91
N    -13157.0                           -50.7



Thus, although U has 164 parameters more than N, it shows a better generalization
performance. This result supports (B2). A pattern often found in U was (see footnote 1)

0.15 : cd(Dir, mkdir) $\leftarrow$ mkdir(Dir, com) and 0.08 : cd(_, mkdir) $\leftarrow$ mkdir(Dir, com)

favoring changing to the directory just made. This knowledge cannot be captured in N:

0.25 : cd(_, mkdir) $\leftarrow$ mkdir(Dir, com).

The results clearly show that abstraction through variables and unification can be beneficial
for some applications, i.e., (B1) and (B2) hold.
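The zero-one win function used in the comparison of U and N amounts to counting the held-out sequences on which U assigns the higher log-likelihood. A minimal sketch follows; the function name `loo_win_rate` and the toy log-likelihood values are hypothetical and are not the Greenberg data.

```python
def loo_win_rate(logp_u, logp_n):
    """Fraction of held-out sequences O with log P_U(O) - log P_N(O) > 0."""
    if len(logp_u) != len(logp_n) or not logp_u:
        raise ValueError("need two non-empty lists of equal length")
    wins = sum(1 for u, n in zip(logp_u, logp_n) if u - n > 0)
    return wins / len(logp_u)

# Toy held-out log-likelihoods for four sequences (illustrative values only).
toy_rate = loo_win_rate([-40.0, -45.0, -50.0, -42.0], [-50.0, -44.0, -55.0, -48.0])
```

Ties count as losses here because the win condition in the definition is a strict inequality.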



6. Real World Applications 

Our intentions here are to investigate whether LOHMMs can be applied to real world 
domains. More precisely, we will investigate whether benefits (B1) and (B2) can also be
exploited in real world application domains. Additionally, we will investigate whether 

(B3) LOHMMs are competitive with ILP algorithms that can also utilize unification and 
abstraction through variables, and 

(B4) LOHMMs can handle tree-structured data similar to PCFGs. 

To this aim, we conducted experiments on two bioinformatics application domains: protein 
fold recognition (Kersting, Raiko, Kramer, & De Raedt, 2003) and mRNA signal structure 
detection (Horvath, Wrobel, & Bohnebeck, 2001). Both application domains are multiclass 
problems with five different classes each. 

1. The sum of probabilities is not the same ($0.15 + 0.08 = 0.23 \neq 0.25$) because of the use of pseudocounts
and because of the subliminal non-determinism (w.r.t. abstract states) in U, i.e., in case that the first
transition fires, the second one also fires.
6.1 Methodology

In order to tackle the multiclass problem with LOHMMs, we followed a plug-in estimate
approach. Let $\{c_1, c_2, \ldots, c_k\}$ be the set of possible classes. Given a finite set of training
examples $\{(x_i, y_i)\}_{i=1}^{n} \subseteq \mathcal{X} \times \{c_1, c_2, \ldots, c_k\}$, one tries to find $f : \mathcal{X} \to \{c_1, c_2, \ldots, c_k\}$

$$f(x) = \arg\max_{c \in \{c_1, c_2, \ldots, c_k\}} P(x \mid M, \lambda_c^*) \cdot P(c) \qquad (5)$$

with low approximation error on the training data as well as on unseen examples. In
Equation (5), $M$ denotes the model structure which is the same for all classes, $\lambda_c^*$ denotes
the maximum likelihood parameters of $M$ for class $c$ estimated on the training examples
with $y_i = c$ only, and $P(c)$ is the prior class distribution.
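The plug-in decision rule of Equation (5) can be sketched as follows. The sketch works in log space to avoid underflow and assumes one log-likelihood function per class, standing in for a LOHMM with class-specific parameters; the names `plug_in_classify`, `toy_loglik` and `toy_prior` are hypothetical, and the toy scoring functions are not real models.

```python
import math

def plug_in_classify(x, class_loglik, class_prior):
    """Return arg max over c of log P(x | M, lambda_c*) + log P(c)."""
    best_class, best_score = None, -math.inf
    for c, loglik in class_loglik.items():
        score = loglik(x) + math.log(class_prior[c])
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Stand-ins for per-class models: each maps a sequence to a log-likelihood.
toy_loglik = {
    "fold1": lambda seq: -0.5 * len(seq),
    "fold2": lambda seq: -0.7 * len(seq),
}
toy_prior = {"fold1": 0.6, "fold2": 0.4}
predicted = plug_in_classify(["st", "he", "st"], toy_loglik, toy_prior)
```

Since the model structure is shared across classes, only the parameters behind each log-likelihood function differ from class to class.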

We implemented the Baum-Welch algorithm (with pseudocounts m, see line 3) for maximum
likelihood parameter estimation using the Prolog system Yap-4.4.4. In all experiments,
we set m = 1 and let the Baum-Welch algorithm stop if the change in log-likelihood was
less than 0.1 from one iteration to the next. The experiments were run on a Pentium-IV
3.2 GHz Linux machine.

6.2 Protein Fold Recognition 

Protein fold recognition is concerned with how proteins fold in nature, i.e., their three- 
dimensional structures. This is an important problem as the biological functions of proteins 
depend on the way they fold. A common approach is to use database searches to find pro- 
teins (of known fold) similar to a newly discovered protein (of unknown fold). To facilitate 
protein fold recognition, several expert-based classification schemes of proteins have been 
developed that group the current set of known protein structures according to the similarity 
of their folds. For instance, the structural classification of proteins (Hubbard, Murzin,
Brenner, & Chotia, 1997) (SCOP) database hierarchically organizes proteins according to their
structures and evolutionary origin. From a machine learning perspective, SCOP induces a
classification problem: given a protein of unknown fold, assign it to the best matching group 
of the classification scheme. This protein fold classification problem has been investigated 
by Turcotte, Muggleton, and Sternberg (2001) based on the inductive logic programming 
(ILP) system PROGOL and by Kersting et al. (2003) based on LOHMMs. 

The secondary structure of protein domains (see footnote 2) can elegantly be represented as
logical sequences. For example, the secondary structure of the Ribosomal protein L4 is represented as

st(null, 2), he(right, alpha, 6), st(plus, 2), he(right, alpha, 4), st(plus, 2),
he(right, alpha, 4), st(plus, 3), he(right, alpha, 4), st(plus, 1), he(hright, alpha, 6)

Helices of a certain type, orientation and length he(HelixType, HelixOrientation, Length), 
and strands of a certain orientation and length st(StrandOrientation, Length) are atoms over 
logical predicates. The application of traditional HMMs to such sequences requires one to 
either ignore the structure of helices and strands, which results in a loss of information, or to 
take all possible combinations (of arguments such as orientation and length) into account, 
which leads to a combinatorial explosion in the number of parameters.

2. A domain can be viewed as a sub-section of a protein which appears in a number of distantly related 
proteins and which can fold independently of the rest of the protein. 
Figure 4: Scheme of a left-to-right LOHMM block model.

The results reported by Kersting et al. (2003) indicate that LOHMMs are well-suited
for protein fold classification: the number of parameters of a LOHMM can be more than an
order of magnitude smaller than that of a corresponding HMM (120 versus approximately
62000) and the generalization performance, a 74% accuracy, is comparable to Turcotte
et al.'s (2001) result based on the ILP system Progol, a 75% accuracy. Kersting et al.
(2003), however, do not cross-validate their results nor investigate, as is common in
bioinformatics, the impact of primary sequence similarity on the classification accuracy. For
instance, the two most commonly requested ASTRAL subsets are the subset of sequences 
with less than 95% identity to each other (95 cut) and with less than 40% identity to each 
other (40 cut). Motivated by this, we conducted the following new experiments. 

The data consists of logical sequences of the secondary structure of protein domains. As 
in the work of Kersting et al. (2003), the task is to predict one of the five most populated 
SCOP folds of alpha and beta proteins (a/b): TIM beta/alpha-barrel (fold 1), NAD(P)- 
binding Rossmann-fold domains (fold 2), Ribosomal protein L4 (fold 23), Cysteine hydrolase 
(fold 37), and Phosphotyrosine protein phosphatases I-like (fold 55). The class of a/b 
proteins consists of proteins with mainly parallel beta sheets (beta-alpha-beta units). The 
data have been extracted automatically from the ASTRAL dataset version 1.65 (Chandonia,
Hon, Walker, Lo Conte, Koehl, & Brenner, 2004) for the 95 cut and for the 40 cut. As
in the work of Kersting et al. (2003), we consider strands and helices only, i.e., coils and 
isolated strands are discarded. For the 95 cut, this yields 816 logical sequences consisting 
of in total 22210 ground atoms. The number of sequences in the classes are listed as 293, 
151, 87, 195, and 90. For the 40 cut, this yields 523 logical sequences consisting of in total 
14986 ground atoms. The number of sequences in the classes are listed as 182, 100, 66, 122, 
and 53. 



LOHMM structure: The used LOHMM structure follows a left-to-right block topology, 
see Figure 4, to model blocks of consecutive helices (resp. strands). Being in a Block of 
some size s, say 3, the model will remain in the same block for s = 3 time steps. A similar 
idea has been used to model haplotypes (Koivisto, Perola, Varilo, Hennah, Ekelund, Lukk,
Peltonen, Ukkonen, & Mannila, 2002; Koivisto, Kivioja, Mannila, Rastas, & Ukkonen,
2004). In contrast to common HMM block models (Won, Prügel-Bennett, & Krogh, 2004),
the transition parameters are shared within each block and one can ensure that the model
makes a transition to the next state s(Block) only at the end of a block; in our example
after exactly 3 intra-block transitions. Furthermore, there are specific abstract transitions
for all helix types and strand orientations to model the prior distribution, the intra- and
the inter-block transitions. The number of blocks and their sizes were chosen according
to the empirical distribution over sequence lengths in the data so that the beginning and
the ending of protein domains were likely captured in detail. This yields the following block
structure

[Diagram of the block structure; the position labels along the blocks read
1, 2, 19, 20, 27, 28, 40, 41, 46, 47, 61, 62, 76, 77.]

where the numbers denote the positions within protein domains. Furthermore, note that 
the last block gathers all remaining transitions. The blocks themselves are modelled using 
hidden abstract states over 

hc(HelixType, HelixOrientation, Length, Block) and sc(StrandOrientation, Length, Block) . 

Here, Length denotes the number of consecutive bases the structure element consists of. 
The length was discretized into 10 bins such that the original lengths were uniformly
distributed. In total, the LOHMM has 295 parameters. The corresponding HMM without
parameter sharing has more than 65200 parameters. This clearly confirms (B1).
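One common way to obtain bins over which the original lengths are uniformly distributed is equal-frequency (quantile) binning. The sketch below illustrates that idea; it is an assumption that the discretization was done this way, and the helper names `equal_frequency_edges` and `bin_index` as well as the toy lengths are hypothetical.

```python
import bisect

def equal_frequency_edges(values, n_bins):
    """Cut points chosen so that each bin receives roughly the same number of values."""
    if n_bins < 1 or not values:
        raise ValueError("need at least one bin and one value")
    ordered = sorted(values)
    return [ordered[(i * len(ordered)) // n_bins] for i in range(1, n_bins)]

def bin_index(value, edges):
    """Index of the bin (0 .. len(edges)) that a value falls into."""
    return bisect.bisect_right(edges, value)

# Toy lengths 1..100 split into 10 bins of 10 values each.
toy_lengths = list(range(1, 101))
toy_edges = equal_frequency_edges(toy_lengths, 10)
toy_bins = [bin_index(v, toy_edges) for v in toy_lengths]
```

With many tied lengths the bins can only be approximately equally populated, since equal values always fall into the same bin.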

Results: We performed a 10-fold cross-validation. On the 95 cut dataset, the accuracy was
76% and each cross-validation iteration took approx. 25 minutes; on the 40 cut, the accuracy
was 73% and each cross-validation iteration took approx. 12 minutes. The results validate
Kersting et al.'s (2003) results and, in turn, clearly show that (B3) holds. Moreover, the 
novel results on the 40 cut dataset indicate that the similarities detected by the LOHMMs 
between the protein domain structures were not accompanied by high sequence similarity. 

6.3 mRNA Signal Structure Detection 

mRNA sequences consist of bases (guanine, adenine, uracil, cytosine) and fold
intramolecularly to form a number of short base-paired stems (Durbin, Eddy, Krogh, & Mitchison,
1998). This base-paired structure is called the secondary structure, cf. Figures 5 and 6. The
secondary structure contains special subsequences called signal structures that are responsi- 
ble for special biological functions, such as RNA-protein interactions and cellular transport. 
The function of each signal structure class is based on the common characteristic binding 
site of all class elements. The elements are not necessarily identical but very similar. They 
can vary in topology (tree structure), in size (number of constituting bases), and in base 
sequence. 

The goal of our experiments was to recognize instances of signal structure classes in
mRNA molecules. The first application of relational learning to recognize the signal
structure class of mRNA molecules was described in the works of Bohnebeck, Horvath, and
Wrobel (1998) and of Horvath et al. (2001), where the relational instance-based learner
RIBL was applied. The dataset we used (see footnote 3) was similar to the one described by Horvath

3. The dataset is not the same as described in the work by Horvath et al. (2001) because we could not obtain 
the original dataset. We will compare to the smaller data set used by Horvath et al., which consisted of 
et al. (2001). It consisted of 93 mRNA secondary structure sequences. More precisely, it was
composed of 15 and 5 SECIS (Selenocysteine Insertion Sequence), 27 IRE (Iron Responsive 
Element), 36 TAR (Trans Activating Region) and 10 histone stem loops constituting five 
classes. 

The secondary structure is composed of different building blocks such as stacking regions,
hairpin loops, interior loops etc. In contrast to the secondary structure of proteins that forms
chains, the secondary structure of mRNA forms a tree. As trees cannot easily be handled
using HMMs, mRNA secondary structure data is more challenging than that of proteins.
Moreover, Horvath et al. (2001) report that making the tree structure available to RIBL 
as background knowledge had an influence on the classification accuracy. More precisely, 
using a simple chain representation RIBL achieved a 77.2% leave-one-out cross-validation 
(LOO) accuracy whereas using the tree structure as background knowledge RIBL achieved 
a 95.4% LOO accuracy. 

We followed Horvath et al.'s experimental setup, that is, we adapted their data repre- 
sentations to LOHMMs and compared a chain model with a tree model. 



Chain Representation: In the chain representation (see also Figure 5),
signal structures are described by single(TypeSingle, Position, Acid) or
helical(TypeHelical, Position, Acid, Acid). Depending on its type, a structure
element is represented by either single/3 or helical/4. Their first argument
TypeSingle (resp. TypeHelical) specifies the type of the structure element, i.e.,
single, bulge3, bulge5, hairpin (resp. stem). The argument Position is the
position of the sequence element within the corresponding structure element counted down,
i.e., $\{n^{13}(0), n^{12}(0), \ldots, n^{1}(0)\}$ (see footnote 4). The maximal position was set to 13 as this was the
maximal position observed in the data. The last argument encodes the observed nucleotide
(pair).
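The counted-down Position argument uses nested applications of the functor n. A small sketch of this encoding follows, writing terms as plain strings; the helper names `peano` and `peano_value` are hypothetical and only illustrate the notation.

```python
def peano(m, functor="n"):
    """Build the term n^m(0) as a string, i.e. the functor applied m times to 0."""
    if m < 0:
        raise ValueError("m must be non-negative")
    return (functor + "(") * m + "0" + ")" * m

def peano_value(term, functor="n"):
    """Inverse of peano: count the nested applications of the functor."""
    prefix, m = functor + "(", 0
    while term.startswith(prefix) and term.endswith(")"):
        term = term[len(prefix):-1]
        m += 1
    if term != "0":
        raise ValueError("not a term of the form n^m(0)")
    return m

# Positions of a structure element of size 13, counted down to n(0).
positions = [peano(m) for m in range(13, 0, -1)]
```

Counting down means that the last sequence element of every structure element is at position n(0), regardless of the element's size.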

The LOHMM structure used again follows the left-to-right block structure shown in
Figure 4. Its underlying idea is to model blocks of consecutive helical structure
elements. The hidden states are modelled using single(TypeSingle, Position, Acid, Block)
and helical(TypeHelical, Position, Acid, Acid, Block). Being in a Block of consecutive
helical (resp. single) structure elements, the model will remain in the Block or transition to a
single element. The transition to a single (resp. helical) element only occurs at Position
n(0). At all other positions n(Position), there were transitions from helical (resp. single)
structure elements to helical (resp. single) structure elements at Position capturing the
dynamics of the nucleotide pairs (resp. nucleotides) within structure elements. For instance,



66 signal structures and is very close to our data set. On a larger data set (with 400 structures) Horvath 
et al. report an error rate of 3.8% . 
4. $n^m(0)$ is shorthand for the recursive application of the functor n to 0, m times, i.e., for position m.
helical(stem, n(0), c, g
helical(stem, n(n(0)), c
helical(stem, n(n(n(0))), c, g

single(bulge5, n(0), a
single(bulge5, n(n(0))
single(bulge5, n(n(n(0))), g

helical(stem, n(0), c, g
helical(stem, n(n(0)), c, g

single(bulge5, n(0), a

helical(stem, n(0), a, a
helical(stem, n(n(0)), u, a
helical(stem, n(n(n(0))), u, g
helical(stem, n(n(n(n(0)))), u,
helical(stem, n(n(n(n(n(0))))), c,
helical(stem, n(n(n(n(n(n(0)))))), u,
helical(stem, n(n(n(n(n(n(n(0))))))), a,

single(hairpin, n(n(n(0))), a).
single(hairpin, n(n(0)), u).
single(hairpin, n(0), u).

single(bulge3, n(0), a).



Figure 5: The chain representation of a SECIS signal structure. The ground atoms are 
ordered clockwise starting with helical(stem, n(n(n(n(n(n(n(0))))))), a, u) at the 
lower left-hand side corner. 



the transitions for block n(0) at position n(n(0)) were

a : he(stem, n(0), X, Y, n(0)) $\xleftarrow{p_a : \text{he(stem, n(0), X, Y)}}$ he(stem, n(n(0)), X, Y, n(0))
b : he(stem, n(0), Y, X, n(0)) $\xleftarrow{p_b : \text{he(stem, n(0), X, Y)}}$ he(stem, n(n(0)), X, Y, n(0))
c : he(stem, n(0), X, _, n(0)) $\xleftarrow{p_c : \text{he(stem, n(0), X, Y)}}$ he(stem, n(n(0)), X, Y, n(0))
d : he(stem, n(0), _, Y, n(0)) $\xleftarrow{p_d : \text{he(stem, n(0), X, Y)}}$ he(stem, n(n(0)), X, Y, n(0))
e : he(stem, n(0), _, _, n(0)) $\xleftarrow{p_e : \text{he(stem, n(0), X, Y)}}$ he(stem, n(n(0)), X, Y, n(0))


In total, there were 5 possible blocks as this was the maximal number of blocks of consecutive 
helical structure elements observed in the data. Overall, the LOHMM has 702 parameters. 
In contrast, the corresponding HMM has more than 16600 transitions validating (B1).

Results: The LOO test log-likelihood was -63.7, and an EM iteration took on average
26 seconds.

Without the unification-based transitions b-d, i.e., using only the abstract transitions

a : he(stem, n(0), X, Y, n(0)) $\xleftarrow{p_a : \text{he(stem, n(0), X, Y)}}$ he(stem, n(n(0)), X, Y, n(0))
e : he(stem, n(0), _, _, n(0)) $\xleftarrow{p_e : \text{he(stem, n(0), X, Y)}}$ he(stem, n(n(0)), X, Y, n(0)),

the model has 506 parameters. The LOO test log-likelihood was -64.21, and an EM
iteration took on average 20 seconds. The difference in LOO test log-likelihood is statistically
significant (paired t-test, p = 0.01).

Omitting even transition a, the LOO test log-likelihood dropped to -66.06, and the
average time per EM iteration was 18 seconds. The model has 341 parameters. The
difference in average LOO log-likelihood is statistically significant (paired t-test, p = 0.001).

The results clearly show that unification can yield better LOO test log-likelihoods, i.e., 
(B2) holds. 
nucleotide_pair((c, g))
nucleotide_pair((c, g))
nucleotide_pair((c, g)).
helical(s(s(s(s(s(0))))), s(s(s(0))), [c], stem, n(n(n(0))))

nucleotide(a)
nucleotide(a)
nucleotide(g).
single(s(s(s(s(0)))), s(s(s(0))), [], bulge5, n(n(n(0))))

nucleotide_pair((c, g))
nucleotide_pair((c, g)).
helical(s(s(s(0))), s(0), [c, c, c], stem, n(n(0)))

nucleotide(a).
single(s(s(0)), s(0), [], bulge5, n(0))

nucleotide_pair((a, a))
nucleotide_pair((u, a))
nucleotide_pair((u, g))
nucleotide_pair((u, a))
nucleotide_pair((c, a))
nucleotide_pair((u, a))
nucleotide_pair((a, u))
helical(s(0), 0, [c, c], stem, n(n(n(n(n(n(n(0))))))))
root(0, root, [c])

single(s(s(s(s(s(s(0)))))), s(s(s(s(s(0))))), [], hairpin, n(n(n(0)))).
nucleotide(a).
nucleotide(u).
nucleotide(u).

single(s(s(s(s(s(s(s(0))))))), s(s(s(0))), [], bulge3, n(0)).
nucleotide(a).

[Panel (b): tree diagram of the secondary structure elements.]



Figure 6: The tree representation of a SECIS signal structure, (a) The logical sequence, 
i.e., the sequence of ground atoms representing the SECIS signal structure. The 
ground atoms are ordered clockwise starting with root(0, root, [c]) in the lower 
left-hand side corner, (b) The tree formed by the secondary structure elements. 



Tree Representation: In the tree representation (see Figure 6 (a)), the idea is to capture 
the tree structure formed by the secondary structure elements, see Figure 6 (b). Each 
training instance is described as a sequence of ground facts over 

root(0, root, #Children),

helical(ID, ParentID, #Children, Type, Size),

nucleotide_pair(BasePair),

single(ID, ParentID, #Children, Type, Size),

nucleotide(Base).

Here, ID and ParentID are natural numbers 0, s(0), s(s(0)), ... encoding the child-parent
relation, #Children denotes the number of children [], [c], [c, c], ... (see footnote 5), Type is the
type of the structure element such as stem, hairpin, ..., and Size is a natural number
0, n(0), n(n(0)), ... Atoms root(0, root, #Children) are used to root the topology. The
maximal #Children was 9 and the maximal Size was 13 as this was the maximal value
observed in the data.

As trees cannot easily be handled using HMMs, we used a LOHMM which basically
encodes a PCFG. Due to Theorem 2, this is possible. The used LOHMM structure can be 
found in Appendix E. It processes the mRNA trees in in-order. Unification is only used for 
parsing the tree. As for the chain representation, we used a Position argument in the hidden 
states to encode the dynamics of nucleotides (nucleotide pairs) within secondary structure 

5. Here, we use the Prolog shorthand notation [·] for lists. A list either is the constant [] representing the
empty list, or is a compound term with functor ./2 and two arguments, which are respectively the head 
and tail of the list. Thus [a, b, c] is the compound term .(a, .(b, .(c, []))). 
elements. The maximal Position was again 13. In contrast to the chain representation,
nucleotide pairs such as (a, u) are treated as constants. Thus, the argument BasePair 
consists of 16 elements. 

Results: The LOO test log-likelihood was -55.56. Thus, exploiting the tree structure
yields better probabilistic models. On average, an EM iteration took 14 seconds. Overall, 
the result shows that (B4) holds. 

Although the Baum-Welch algorithm attempts to maximize a different objective function,
namely the likelihood of the data, it is interesting to compare LOHMMs and RIBL in
terms of classification accuracy.

Classification Accuracy: On the chain representation, the LOO accuracies of all 
LOHMMs were 99% (92/93). This is a considerable improvement on RIBL's 77.2% (51/66) 
LOO accuracy for this representation. On the tree representation, the LOHMM also 
achieved a LOO accuracy of 99% (92/93). This is comparable to RIBL's LOO accuracy of 
97% (64/66) on this kind of representation. 

Thus, already the chain LOHMMs show marked increases in LOO accuracy when com- 
pared to RIBL (Horvath et al., 2001). In order to achieve similar LOO accuracies, Horvath 
et al. (2001) had to make the tree structure available to RIBL as background knowledge. 
For LOHMMs, this had a significant influence on the LOO test log-likelihood, but not on
the LOO accuracies. This clearly supports (B3). Moreover, according to Horvath et al., 
the mRNA application can also be considered a success in terms of the application domain, 
although this was not the primary goal of our experiments. There exist also alternative 
parameter estimation techniques and other models, such as covariance models (Eddy & 
Durbin, 1994) or pair hidden Markov models (Sakakibara, 2003), that might have been 
used as well as a basis for comparison. However, as LOHMMs employ (inductive) logic pro- 
gramming principles, it is appropriate to compare with other systems within this paradigm 
such as RIBL. 

7. Related Work 

LOHMMs combine two different research directions. On the one hand, they are related to 
several extensions of HMMs and probabilistic grammars. On the other hand, they are also 
related to the recent interest in combining inductive logic programming principles with 
probability theory (De Raedt & Kersting, 2003, 2004). 

In the first type of approaches, the underlying idea is to upgrade HMMs and probabilistic 
grammars to represent more structured state spaces. 

Hierarchical HMMs (Fine, Singer, & Tishby, 1998), factorial HMMs (Ghahramani & 
Jordan, 1997), and HMMs based on tree automata (Frasconi, Soda, & Vullo, 2002) decom- 
pose the state variables into smaller units. In hierarchical HMMs states themselves can be 
HMMs, in factorial HMMs they can be factored into k state variables which depend on one 
another only through the observation, and in tree based HMMs the represented probability 
distributions are defined over tree structures. The key difference with LOHMMs is that 
these approaches do not employ the logical concept of unification. Unification is essential 
because it allows us to introduce abstract transitions, which do not consist of more detailed
states. As our experimental evidence shows, sharing information among abstract states by 
means of unification can lead to more accurate model estimation. The same holds for re- 
lational Markov models (RMMs) (Anderson, Domingos, & Weld, 2002) to which LOHMMs 
are most closely related. In RMMs, states can be of different types, with each type described 
by a different set of variables. The domain of each variable can be hierarchically structured. 
The main differences between LOHMMs and RMMs are that RMMs support neither
variable binding nor unification nor hidden states.

The equivalent of HMMs for context-free languages are probabilistic context-free gram- 
mars (PCFGs). Like HMMs, they do not consider sequences of logical atoms and do not 
employ unification. Nevertheless, there is a formal resemblance between the Baum-Welch
algorithms for LOHMMs and for PCFGs. In case that a LOHMM encodes a PCFG both 
algorithms are identical from a theoretical point of view. They re-estimate the parameters 
as the ratio of the expected number of times a transition (resp. production) is used and the 
expected number of times a transition (resp. production) might have been used. The proof
of Theorem 2 assumes that the PCFG is given in Greibach normal form (GNF; see footnote 6) and uses a
pushdown automaton to parse sentences. For grammars in GNF, pushdown automata are
common for parsing. In contrast, the actual computations of the Baum-Welch algorithm
for PCFGs, the so-called Inside-Outside algorithm (Baker, 1979; Lari & Young, 1990), are
usually formulated for grammars in Chomsky normal form (see footnote 7). The Inside-Outside algorithm
can make use of the efficient CYK algorithm (Hopcroft & Ullman, 1979) for parsing strings.

An alternative to learning PCFGs from strings only is to learn from more structured data 
such as skeletons, which are derivation trees with the nonterminal nodes removed (Levy & 
Joshi, 1978). Skeletons are exactly the set of trees accepted by skeletal tree automata (STA). 
Informally, an STA, when given a tree as input, processes the tree bottom up, assigning a 
state to each node based on the states of that node's children. The STA accepts a tree iff 
it assigns a final state to the root of the tree. Due to this automata-based characterization 
of the skeletons of derivation trees, the learning problem of (P)CFGs can be reduced to
the problem of learning an STA. In particular, STA techniques have been adapted to learning tree
grammars and (P)CFGs (Sakakibara, 1992; Sakakibara et al., 1994) efficiently. 
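The bottom-up processing of a skeleton by an STA can be sketched as follows. This is a simplified, deterministic illustration rather than any of the cited learning algorithms: trees are pairs `(label, children)`, internal skeleton nodes carry the placeholder label `"*"`, and the transition table `toy_delta` for the toy grammar S -> a S b | a b is hypothetical.

```python
def run_sta(tree, delta):
    """Assign a state to a node from its label and the states of its children."""
    label, children = tree
    child_states = tuple(run_sta(child, delta) for child in children)
    return delta.get((label, child_states))   # None: no transition applies

def accepts(tree, delta, final_states):
    """The automaton accepts iff the state assigned to the root is final."""
    return run_sta(tree, delta) in final_states

# Transitions for skeletons of the toy grammar S -> a S b | a b.
toy_delta = {
    ("a", ()): "qa",
    ("b", ()): "qb",
    ("*", ("qa", "qb")): "qS",
    ("*", ("qa", "qS", "qb")): "qS",
}
leaf_a, leaf_b = ("a", []), ("b", [])
skeleton_aabb = ("*", [leaf_a, ("*", [leaf_a, leaf_b]), leaf_b])
skeleton_bad = ("*", [leaf_b, leaf_a])
```

A node for which no transition applies receives no state, and this failure propagates upwards, so such a tree is rejected at the root.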

PCFGs have been extended in several ways. Most closely related to LOHMMs are 
unification-based grammars which have been extensively studied in computational
linguistics. Examples are (stochastic) attribute-value grammars (Abney, 1997), probabilistic
feature grammars (Goodman, 1997), head-driven phrase structure grammars (Pollard & Sag,
1994), and lexical-functional grammars (Bresnan, 2001). For learning within such
frameworks, methods from undirected graphical models are used; see the work of Johnson (2003)
for a description of some recent work. The key difference to LOHMMs is that only nonter- 
minals are replaced with structured, more complex entities. Thus, observation sequences of 
flat symbols and not of atoms are modelled. Goodman's probabilistic feature grammars are 
an exception. They treat terminals and nonterminals as vectors of features. No abstraction 
is made, i.e., the feature vectors are ground instances, and no unification can be employed. 

6. A grammar is in GNF iff all productions are of the form $A \rightarrow aV$, where $A$ is a variable, $a$ is exactly one
terminal, and $V$ is a string of zero or more variables.

7. A grammar is in CNF iff every production is of the form $A \rightarrow BC$ or $A \rightarrow a$, where $A$, $B$ and $C$ are variables,
and $a$ is a terminal.
Figure 7: (a) Each atom in the logical sequence mkdir(vt100x), mv(new*, vt100x),
ls(vt100x), cd(vt100x) forms a tree. The shaded nodes denote shared labels
among the trees. (b) The same sequence represented as a single tree. The predicate
con/2 represents the concatenation operator.



Therefore, the number of parameters that need to be estimated easily becomes very large,
and data sparsity is a serious problem. Goodman applied smoothing to overcome the problem.

LOHMMs are generally related to (stochastic) tree automata (see e.g., Carrasco,
Oncina, and Calera-Rubio, 2001). Reconsider the Unix command sequence
mkdir(vt100x), mv(new*, vt100x), ls(vt100x), cd(vt100x). Each atom forms a tree, see
Figure 7 (a), and, indeed, the whole sequence of atoms also forms a (degenerate) tree,
see Figure 7 (b). Tree automata process single trees vertically, e.g., bottom-up. A state in
the automaton is assigned to every node in the tree. The state depends on the node label
and on the states associated with the siblings of the node. Tree automata do not focus on
sequential domains. In contrast, LOHMMs are intended for learning in sequential domains. They
process sequences of trees horizontally, i.e., from left to right. Furthermore, unification
is used to share information between consecutive sequence elements. As Figure 7 (b)
illustrates, tree automata can only employ this information when allowing higher-order
transitions, i.e., states depend on their node labels and on the states associated with
predecessors 1, 2, ... levels down the tree.
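
To illustrate how unification can share information between consecutive sequence elements, the following minimal sketch matches the body of a single abstract transition against a ground atom and applies the resulting substitution to the head. It assumes flat atoms represented as `(functor, args)` tuples, capitalized strings as variables and `_` as an anonymous variable; the helper names `match` and `apply_subst` are hypothetical, and probabilities are omitted.

```python
from typing import Dict, Optional, Tuple

Atom = Tuple[str, Tuple[str, ...]]  # (functor, arguments)


def match(pattern: Atom, ground: Atom,
          subst: Optional[Dict[str, str]] = None) -> Optional[Dict[str, str]]:
    """One-way unification of a flat pattern atom against a ground atom."""
    result = dict(subst or {})
    (p_functor, p_args), (g_functor, g_args) = pattern, ground
    if p_functor != g_functor or len(p_args) != len(g_args):
        return None
    for p, g in zip(p_args, g_args):
        if p == "_":                      # anonymous variable matches anything
            continue
        if p[0].isupper():                # named variable: bind or check binding
            if result.setdefault(p, g) != g:
                return None
        elif p != g:                      # constants must be identical
            return None
    return result


def apply_subst(pattern: Atom, subst: Dict[str, str]) -> Atom:
    """Replace bound variables in `pattern` by their values."""
    functor, args = pattern
    return functor, tuple(subst.get(a, a) for a in args)


# Assumed abstract transition ls(Dir) <- mkdir(Dir), without probabilities.
body: Atom = ("mkdir", ("Dir",))
head: Atom = ("ls", ("Dir",))

theta = match(body, ("mkdir", ("vt100x",)))
next_state = apply_subst(head, theta)   # the directory name is carried over
print(next_state)
```

Because `Dir` occurs in both head and body, the directory observed in the current state constrains the next state, which is the kind of information sharing a first-order tree automaton cannot express directly.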

In the second type of approaches, most attention has been devoted to developing highly
expressive formalisms, such as PCUP (Eisele, 1994), PCLP (Riezler, 1998), SLPs (Muggleton,
1996), PLPs (Ngo & Haddawy, 1997), RBNs (Jaeger, 1997), PRMs (Friedman,
Getoor, Koller, & Pfeffer, 1999), PRISM (Sato & Kameya, 2001), BLPs (Kersting & De
Raedt, 2001b, 2001a), and DPRMs (Sanghai, Domingos, & Weld, 2003). LOHMMs can be
seen as an attempt towards downgrading such highly expressive frameworks. Indeed, applying
the main idea underlying LOHMMs to non-regular probabilistic grammars, i.e., replacing
flat symbols with atoms, yields - in principle - stochastic logic programs (Muggleton, 1996).
As a consequence, LOHMMs represent an interesting position on the expressiveness scale.
Whereas they retain most of the essential logical features of the more expressive formalisms,
they seem easier to understand, adapt and learn. This is akin to many contemporary considerations
in inductive logic programming (Muggleton & De Raedt, 1994) and multi-relational
data mining (Dzeroski & Lavrac, 2001).

8. Conclusions 

Logical hidden Markov models, a new formalism for representing probability distributions 
over sequences of logical atoms, have been introduced and solutions to the three central 
inference problems (evaluation, most likely state sequence and parameter estimation) have 
been provided. Experiments have demonstrated that unification can improve generalization 
accuracy, that the number of parameters of a LOHMM can be an order of magnitude smaller 
than the number of parameters of the corresponding HMM, that the solutions presented 
perform well in practice and also that LOHMMs possess several advantages over traditional 
HMMs for applications involving structured sequences. 

Appendix A. Proof of Theorem 1

Let $M = (\Sigma, \mu, \Delta, \Upsilon)$ be a LOHMM. To show that $M$ specifies a time discrete stochastic
process, i.e., a sequence of random variables $\langle X_t \rangle_{t=1,2,\ldots}$, where the domain of each random
variable $X_t$ is $hb(\Sigma)$, the Herbrand base over $\Sigma$, we define the immediate state operator
($T_M$-operator) and the current emission operator ($E_M$-operator).

Definition 4 ($T_M$-Operator, $E_M$-Operator) The operators $T_M : 2^{hb_\Sigma} \rightarrow 2^{hb_\Sigma}$ and
$E_M : 2^{hb_\Sigma} \rightarrow 2^{hb_\Sigma}$ are

$$T_M(I) = \{ H\sigma_B\sigma_H \mid \exists (p : H \xleftarrow{O} B) \in M : B\sigma_B \in I,\ H\sigma_B\sigma_H \in G_\Sigma(H) \}$$

$$E_M(I) = \{ O\sigma_B\sigma_H\sigma_O \mid \exists (p : H \xleftarrow{O} B) \in M : B\sigma_B \in I,\ H\sigma_B\sigma_H \in G_\Sigma(H) \text{ and } O\sigma_B\sigma_H\sigma_O \in G_\Sigma(O) \}$$

For each $i = 1, 2, 3, \ldots$, the set $T_M^{i+1}(\{\mathrm{start}\}) := T_M(T_M^{i}(\{\mathrm{start}\}))$ with
$T_M^{1}(\{\mathrm{start}\}) := T_M(\{\mathrm{start}\})$ specifies the state set at clock $i$, which forms a random
variable $Y_i$. The set $U_M^{i}(\{\mathrm{start}\})$ specifies the possible symbols emitted when transitioning
from $i$ to $i + 1$. It forms the variable $U_i$. Each $Y_i$ (resp. $U_i$) can be extended to a random
variable $Z_i$ (resp. $U_i$) over $hb_\Sigma$:




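$$P(Z_i = z) = \begin{cases} 0.0 & \text{if } z \notin T_M^{i}(\{\mathrm{start}\}) \\ P(Y_i = z) & \text{otherwise} \end{cases}$$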








Figure 8: Discrete time stochastic process induced by a LOHMM. The nodes $Z_i$ and $U_i$
represent random variables over $hb_\Sigma$.



Figure 8 depicts the influence relation among $Z_i$ and $U_i$. Using standard arguments from
probability theory and noting that

$$P(U_i = u_i \mid Z_{i+1} = z_{i+1}, Z_i = z_i) = \frac{P(Z_{i+1} = z_{i+1}, U_i = u_i \mid Z_i)}{\sum_{u_i} P(z_{i+1}, u_i \mid Z_i)}$$

and

$$P(Z_{i+1} \mid Z_i) = \sum_{u_i} P(Z_{i+1}, u_i \mid Z_i)$$

where the probability distributions are due to equation (1), it is easy to show that Kolmogorov's
extension theorem (see Bauer, 1991; Fristedt and Gray, 1997) holds. Thus, $M$
specifies a unique probability distribution over $\bigotimes_{i=1}^{t} (Z_i \times U_i)$ for each $t > 0$ and in the
limit $t \rightarrow \infty$. □



Appendix B. Moore Representations of LOHMMs 

For HMMs, Moore representations, in which output symbols are associated with states, and Mealy
representations, in which output symbols are associated with transitions, are equivalent. In this
appendix, we investigate to what extent this also holds for LOHMMs.

Let $L$ be a Mealy-LOHMM according to Definition 3. In the following, we will derive
the notation of an equivalent LOHMM $L'$ in Moore representation where there are abstract
transitions and abstract emissions (see below). Each predicate b/n in $L$ is extended to b/(n+1)
in $L'$. The domains of the first $n$ arguments are the same as for b/n. The last argument
will store the observation to be emitted. More precisely, for each abstract transition

$$p : h(w_1, \ldots, w_l) \xleftarrow{o(v_1, \ldots, v_k)} b(u_1, \ldots, u_n)$$

in $L$, there is an abstract transition

$$p : h(w_1, \ldots, w_l, o(v'_1, \ldots, v'_k)) \leftarrow b(u_1, \ldots, u_n, \_)$$

in $L'$. The primes in $o(v'_1, \ldots, v'_k)$ denote that we replaced each free variable (see footnote 8) in
$o(v_1, \ldots, v_k)$ by some distinguished constant symbol. Due to this, it holds that

$$\mu(h(w_1, \ldots, w_l)) = \mu(h(w_1, \ldots, w_l, o(v'_1, \ldots, v'_k))), \qquad (6)$$

8. A variable $X \in vars(o(v_1, \ldots, v_k))$ is free iff $X \notin vars(h(w_1, \ldots, w_l)) \cup vars(b(u_1, \ldots, u_n))$.

and the output distribution of $L'$ can be specified using abstract emissions, which are expressions
of the form

$$1.0 : o(v_1, \ldots, v_k) \leftarrow h(w_1, \ldots, w_l, o(v'_1, \ldots, v'_k)). \qquad (7)$$

The semantics of an abstract transition in $L'$ is that, being in some
state $S'_t \in G_{\Sigma'}(b(u_1, \ldots, u_n, \_))$, the system will make a transition into state
$S'_{t+1} \in G_{\Sigma'}(h(w_1, \ldots, w_l, o(v'_1, \ldots, v'_k)))$ with probability

$$p \cdot \mu(S'_{t+1} \mid h(w_1, \ldots, w_l, o(v'_1, \ldots, v'_k))\,\sigma_{S'_t}) \qquad (8)$$

where $\sigma_{S'_t} = mgu(S'_t, b(u_1, \ldots, u_n, \_))$. Due to Equation (6), Equation (8) can be rewritten
as

$$p \cdot \mu(S'_{t+1} \mid h(w_1, \ldots, w_l)\,\sigma_{S'_t}).$$

Due to Equation (7), the system will emit the output symbol $o_{t+1} \in G_{\Sigma'}(o(v_1, \ldots, v_k))$ in
state $S'_{t+1}$ with probability

$$\mu(o_{t+1} \mid o(v_1, \ldots, v_k)\,\sigma_{S'_{t+1}}\sigma_{S'_t})$$

where $\sigma_{S'_{t+1}} = mgu(h(w_1, \ldots, w_l, o(v'_1, \ldots, v'_k)), S'_{t+1})$. Due to the construction of $L'$, there
exists a triple $(S_t, S_{t+1}, O_{t+1})$ in $L$ for each triple $(S'_t, S'_{t+1}, O_{t+1})$, $t > 0$, in $L'$ (and vice
versa). Hence, both LOHMMs assign the same overall transition probability.

$L$ and $L'$ differ only in the way they initialize sequences $\langle (S'_t, S'_{t+1}, O_{t+1}) \rangle_{t=0,2,\ldots,T}$ (resp.
$\langle (S_t, S_{t+1}, O_{t+1}) \rangle_{t=0,2,\ldots,T}$). Whereas $L$ starts in some state $S_0$ and makes a transition to $S_1$
emitting $O_1$, the Moore-LOHMM $L'$ is supposed to emit a symbol $O_0$ in $S'_0$ before making a
transition to $S'_1$. We compensate for this using the prior distribution. The existence of the
correct prior distribution for $L'$ can be seen as follows. In $L$, there are only finitely many
states reachable at time $t = 1$, i.e., $P_L(q_0 = S) > 0$ holds for only a finite set of ground
states $S$. The probability $P_L(q_0 = S)$ can be computed similarly to $\alpha_1(S)$. We set $t = 1$ in line
6, neglect the condition on $O_{t-1}$ in line 10, and drop $\mu(O_{t-1} \mid O\sigma_B\sigma_H)$ from line 14.
Completely listing all states $S \in S_1$ together with $P_L(q_0 = S)$, i.e., $P_L(q_0 = S) : S \leftarrow \mathrm{start}$,
constitutes the prior distribution of $L'$.

The argumentation basically followed the approach to transform a Mealy machine into 
a Moore machine (see e.g., Hopcroft and Ullman, 1979). Furthermore, the mapping of a 
Moore-LOHMM - as introduced in the present section - into a Mealy-LOHMM is straight- 
forward. 
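
For readers unfamiliar with the classical construction that the argument follows, here is a minimal sketch for ordinary deterministic, non-probabilistic machines. It is not the LOHMM construction itself: the dictionary encoding and the names `mealy_to_moore`, `run_mealy` and `run_moore` are assumptions made for illustration. As in the appendix, each Moore state carries the observation to be emitted as an extra component.

```python
from typing import Dict, Hashable, List, Optional, Tuple

State = Hashable
Mealy = Dict[Tuple[State, str], Tuple[State, str]]   # (state, input) -> (next, output)
MooreState = Tuple[State, Optional[str]]             # (Mealy state, stored output)


def mealy_to_moore(mealy: Mealy, start: State):
    """Build a Moore machine whose states store the output of the last transition."""
    init: MooreState = (start, None)                 # the initial state emits nothing
    trans: Dict[Tuple[MooreState, str], MooreState] = {}
    out: Dict[MooreState, Optional[str]] = {init: None}
    frontier = [init]
    while frontier:
        state = frontier.pop()
        q, _ = state
        for (src, sym), (dst, o) in mealy.items():
            if src != q:
                continue
            target = (dst, o)                        # output moves into the state
            trans[(state, sym)] = target
            if target not in out:
                out[target] = o
                frontier.append(target)
    return init, trans, out


def run_mealy(mealy: Mealy, start: State, inputs: str) -> List[str]:
    q, outputs = start, []
    for sym in inputs:
        q, o = mealy[(q, sym)]
        outputs.append(o)
    return outputs


def run_moore(init, trans, out, inputs: str) -> List[Optional[str]]:
    state, outputs = init, []
    for sym in inputs:
        state = trans[(state, sym)]
        outputs.append(out[state])                   # emit on entering the state
    return outputs


# Assumed example: output "1" exactly when the current and previous input are "1".
MEALY: Mealy = {
    ("A", "0"): ("A", "0"), ("A", "1"): ("B", "0"),
    ("B", "0"): ("A", "0"), ("B", "1"): ("B", "1"),
}
INIT, TRANS, OUT = mealy_to_moore(MEALY, "A")
print(run_mealy(MEALY, "A", "1101"), run_moore(INIT, TRANS, OUT, "1101"))
```

The special initial state `(start, None)` plays the role that the prior distribution plays in the LOHMM setting: it absorbs the mismatch between emitting on transitions and emitting in states at the start of a sequence.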

Appendix C. Proof of Theorem 2 

Let $T$ be a terminal alphabet and $N$ a nonterminal alphabet. A probabilistic context-free
grammar (PCFG) $G$ consists of a distinguished start symbol $S \in N$ plus a finite set of
productions of the form $p : X \rightarrow \alpha$, where $X \in N$, $\alpha \in (N \cup T)^*$ and $p \in [0, 1]$. For all
$X \in N$, $\sum_{p : X \rightarrow \alpha} p = 1$. A PCFG defines a stochastic process with sentential forms as states,
and leftmost rewriting steps as transitions. We denote a single rewriting operation of the
grammar by a single arrow $\rightarrow$. If as a result of one or more rewriting operations we are
able to rewrite $\beta \in (N \cup T)^*$ as a sequence $\gamma \in (N \cup T)^*$ of nonterminals and terminals,
then we write $\beta \Rightarrow^* \gamma$. The probability of this rewriting is the product of all probability
values associated with productions used in the derivation. We assume $G$ to be consistent, i.e.,
that the probabilities of all derivations $S \Rightarrow^* \beta$ such that $\beta \in T^*$ sum to 1.0.

We can assume that the PCFG $G$ is in Greibach normal form. This follows from Abney
et al.'s (1999) Theorem 6 because $G$ is consistent. Thus, every production $P \in G$ is of
the form $p : X \rightarrow aY_1 \ldots Y_n$ for some $n \geq 0$. In order to encode $G$ as a LOHMM $M$, we
introduce (1) for each non-terminal symbol $X$ in $G$ a constant symbol nX and (2) for each
terminal symbol $t$ in $G$ a constant symbol t. For each production $P \in G$, we include an
abstract transition of the form $p : \mathtt{stack}([nY_1, \ldots, nY_n \mid S]) \xleftarrow{a} \mathtt{stack}([nX \mid S])$, if $n > 0$, and
$p : \mathtt{stack}(S) \xleftarrow{a} \mathtt{stack}([nX \mid S])$, if $n = 0$. Furthermore, we include $1.0 : \mathtt{stack}([s]) \leftarrow \mathtt{start}$
and $1.0 : \mathtt{end} \xleftarrow{\mathtt{end}} \mathtt{stack}([\,])$. It is now straightforward to prove by induction that $M$ and $G$
are equivalent. □
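
The stack encoding used in the proof can be illustrated with a minimal sketch. It assumes a PCFG already given in Greibach normal form as a list of tuples and uses plain Python tuples in place of the logical atom stack/1; the function name `string_probability` and the example grammar are hypothetical. Each step pops the top nonterminal, emits one terminal and pushes the remaining nonterminals, which mirrors one abstract transition.

```python
from typing import List, Tuple

# A GNF production p : X -> a Y1 ... Yn is stored as (p, X, a, (Y1, ..., Yn)).
Production = Tuple[float, str, str, Tuple[str, ...]]


def string_probability(productions: List[Production], start: str, word: str) -> float:
    """Sum the probabilities of all stack-machine runs that emit `word`."""

    def run(stack: Tuple[str, ...], pos: int) -> float:
        if pos == len(word):
            # A run succeeds only if the stack is empty (the `end` state).
            return 1.0 if not stack else 0.0
        if not stack:
            return 0.0
        top, rest = stack[0], stack[1:]
        total = 0.0
        for p, lhs, terminal, rhs in productions:
            if lhs == top and terminal == word[pos]:
                # Pop nX, push nY1 ... nYn, emit the terminal with probability p.
                total += p * run(rhs + rest, pos + 1)
        return total

    return run((start,), 0)


# Assumed example grammar generating a^n b^(n+1):
#   0.4 : S -> a S B     0.6 : S -> b     1.0 : B -> b
GRAMMAR: List[Production] = [
    (0.4, "S", "a", ("S", "B")),
    (0.6, "S", "b", ()),
    (1.0, "B", "b", ()),
]
print(string_probability(GRAMMAR, "S", "abb"))
```

Every step consumes exactly one terminal, so the recursion depth is bounded by the length of the input string.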

Appendix D. Logical Hidden Markov Model for Unix Command 
Sequences 

The LOHMMs described below model Unix command sequences triggered by mkdir. To
this aim, we transformed the original Greenberg data into a sequence of logical atoms over
com, mkdir(Dir, LastCom), ls(Dir, LastCom), cd(Dir, Dir, LastCom), cp(Dir, Dir, LastCom)
and mv(Dir, Dir, LastCom). The domain of LastCom was {start, com, mkdir, ls, cd, cp, mv}.
The domain of Dir consisted of all argument entries for mkdir, ls, cd, cp, mv in the original
dataset. Switches, pipes, etc. were neglected, and paths were made absolute. This yields
212 constants in the domain of Dir. All original commands that were not mkdir, ls, cd,
cp, or mv were represented as com. If mkdir did not appear within 10 time steps before a
command C ∈ {ls, cd, cp, mv}, C was represented as com. Overall, this yields more than
451000 ground states that have to be covered by a Markov model.

The "unification" LOHMM U basically implements a second order Markov model, i.e., 
the probability of making a transition depends upon the current state and the previous 
state. It has 542 parameters and the following structure: 




mkdir(Dir, com) ← com.
end ← com.
com ← com.



Furthermore, for each C ∈ {start, com} there are

ls(Dir, mkdir) ← mkdir(Dir, C).
ls(_, mkdir) ← mkdir(Dir, C).
cd(Dir, mkdir) ← mkdir(Dir, C).
cd(_, mkdir) ← mkdir(Dir, C).
cp(_, Dir, mkdir) ← mkdir(Dir, C).
cp(Dir, _, mkdir) ← mkdir(Dir, C).
cp(_, _, mkdir) ← mkdir(Dir, C).
mv(_, Dir, mkdir) ← mkdir(Dir, C).
mv(Dir, _, mkdir) ← mkdir(Dir, C).
mv(_, _, mkdir) ← mkdir(Dir, C).

together with, for each C ∈ {mkdir, ls, cd, cp, mv} and for each C1 ∈ {cd, ls} (resp.
C2 ∈ {cp, mv}),

mkdir(Dir, com) ← C1(Dir, C).
mkdir(_, com) ← C1(Dir, C).
com ← C1(Dir, C).
end ← C1(Dir, C).
ls(Dir, C1) ← C1(Dir, C).
ls(_, C1) ← C1(Dir, C).
cd(Dir, C1) ← C1(Dir, C).
cd(_, C1) ← C1(Dir, C).
cp(_, Dir, C1) ← C1(Dir, C).
cp(Dir, _, C1) ← C1(Dir, C).
cp(_, _, C1) ← C1(Dir, C).
mv(_, Dir, C1) ← C1(Dir, C).
mv(Dir, _, C1) ← C1(Dir, C).
mv(_, _, C1) ← C1(Dir, C).

mkdir(_, com) ← C2(From, To, C).
com ← C2(From, To, C).
end ← C2(From, To, C).
ls(From, C2) ← C2(From, To, C).
ls(To, C2) ← C2(From, To, C).
ls(_, C2) ← C2(From, To, C).
cd(From, C2) ← C2(From, To, C).
cd(To, C2) ← C2(From, To, C).
cd(_, C2) ← C2(From, To, C).
cp(From, _, C2) ← C2(From, To, C).
cp(_, To, C2) ← C2(From, To, C).
cp(_, _, C2) ← C2(From, To, C).
mv(From, _, C2) ← C2(From, To, C).
mv(_, To, C2) ← C2(From, To, C).
mv(_, _, C2) ← C2(From, To, C).



Because all states are fully observable, we omitted the output symbols associated with 
clauses, and, for the sake of simplicity, we omitted associated probability values. 

The "no unification" LOHMM N is the variant of U where no variables were shared 
such as 



mkdir(_, com) ← cp(From, To, C).
com ← cp(From, To, C).
end ← cp(From, To, C).
ls(_, cp) ← cp(From, To, C).
cd(_, cp) ← cp(From, To, C).
cp(_, _, cp) ← cp(From, To, C).
mv(_, _, cp) ← cp(From, To, C).



Because only transitions are affected, N has 164 fewer parameters than U, i.e., 378.



Appendix E. Tree-based LOHMM for mRNA Sequences 

The LOHMM processes the nodes of mRNA trees in in-order. The structure of the LOHMM 
is shown at the end of the section. There are copies of the shaded parts. Terms are 
abbreviated using their starting alphanumerical; tr stands for tree, he for helical, si for 
single, nuc for nucleotide, and nuc_p for nucleotide_pair. 

The domain of #Children covers the maximal branching factor found in the data, i.e.,
{[c], [c, c], . . . , [c, c, c, c, c, c, c, c, c]}; the domain of Type consists of all types occurring in
the data, i.e., {stem, single, bulge3, bulge5, hairpin}; and for Size, the domain covers
the maximal length of a secondary structure element in the data, i.e., the longest sequence
of consecutive bases or base pairs, respectively, constituting a secondary structure element.
The length was encoded as $\{n^1(0), n^2(0), \ldots, n^{13}(0)\}$, where $n^m(0)$ denotes the recursive
application of the functor n m times. For Base and BasePair, the domains were the 4 bases
and the 16 base pairs, respectively. In total, there are 491 parameters.
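
The length encoding can be made concrete with a small sketch. It assumes terms are written as plain strings; the helper names `encode_length` and `decode_length` are hypothetical and serve only to illustrate the notation for the recursive application of the functor n.

```python
def encode_length(m: int) -> str:
    """Return the term obtained by applying the functor n to 0 exactly m times."""
    if m < 0:
        raise ValueError("length must be non-negative")
    term = "0"
    for _ in range(m):
        term = f"n({term})"
    return term


def decode_length(term: str) -> int:
    """Count the nested applications of n in a term built by encode_length."""
    return term.count("n(")


# The domain of Size described above: one term for each length from 1 to 13.
SIZE_DOMAIN = [encode_length(m) for m in range(1, 14)]
print(SIZE_DOMAIN[0], SIZE_DOMAIN[2])
```

A successor-style encoding of this kind lets a clause relate a length to its predecessor through the structure of the term rather than through arithmetic.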